
Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Vol 23 (2012)
Abstract
At the beginning of the 2000s, each node of a high-performance cluster with distributed memory contained a single-core processor, and each MPI process of a parallel application used all the resources of its node. Today vendors offer microprocessors with multiple cores on a chip and several processors on a single board. Using multiple threads within a node built on modern multicore processors increases the performance of a parallel application thanks to shared memory and lower overhead. An extension of a model for parallel SPMD programs has been developed that supports Java threads. The use of threads allows better utilization of the resources of a multicore processor. The developed model makes it possible to estimate the execution time of a parallel program with explicit calls to the MPI library, where parallel Java threads can be used in each process. However, many problems arise when threads are used in the Java environment. This paper gives recommendations for performance tuning of multiprocess-multithreaded programs concerning JVM memory management, garbage collector configuration, management of local buffers, etc. The Java version of the parallel application FT (Fast Fourier Transform) from the NAS Parallel Benchmarks (NPB) has been adapted for the multiprocess-multithreaded environment. Tests of the adapted application show a 9-14% performance improvement. A model of a multiprocess-multithreaded application has been developed; performance prediction for the multiprocess-multithreaded FT shows a 3-7% prediction error.
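
A minimal sketch of the execution scheme this abstract describes, assuming a fork-join division of one rank's shared array among Java threads; the class name, buffer size, and sine workload are illustrative, and the MPI exchange is left as a comment since no particular Java MPI binding is named in the abstract:

    import java.util.concurrent.*;

    // One process (normally an MPI rank) runs several Java threads that
    // share the rank's buffer instead of exchanging messages.
    public class HybridWorker {
        public static void main(String[] args) throws Exception {
            double[] data = new double[1 << 20];      // shared buffer of one rank
            int threads = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            CountDownLatch done = new CountDownLatch(threads);
            int chunk = data.length / threads;
            for (int t = 0; t < threads; t++) {
                final int lo = t * chunk;
                final int hi = (t == threads - 1) ? data.length : lo + chunk;
                pool.execute(() -> {
                    for (int i = lo; i < hi; i++) {
                        data[i] = Math.sin(i);        // stand-in for the FT work
                    }
                    done.countDown();
                });
            }
            done.await();                             // barrier inside the rank
            // ... an MPI exchange over 'data' between ranks would follow here ...
            pool.shutdown();
        }
    }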
Abstract

Implementing algorithms for field-programmable gate arrays (FPGAs) using a hardware description language is a complex task. Therefore, it would be useful to have a tool that can efficiently translate an algorithm from a high-level language to a hardware description language. In this paper we consider C-to-HDL, a tool that translates C functions into Verilog modules; we describe the translation process and two important optimizations implemented at the hardware description level: assignment inlining and loop pipelining. The basic structure and ABI of C-to-HDL follow those of the C-to-Verilog translator, which inspired our tool. We also use the LLVM infrastructure for translating the LLVM bitcode representation into Verilog code. Simple arithmetic operations are executed within a single cycle. For complex arithmetic operations and loads/stores from/to memory, a set of assignments loading the instruction operands, the memory address, and the memory access mode (read or write) is first generated and placed on the first cycle of the instruction's execution; then the final assignment transferring the operation result to a virtual register is generated.

The following optimizations are implemented and greatly improve execution performance. Instruction scheduling is performed taking into account that the number of instructions executed in parallel is limited only by the free FPGA chip space, while memory operations are limited by the number of memory channels and blocks on the FPGA. Where possible, assignments of temporary registers are inlined to form more complex operations, but without producing overly long dependence chains that would unduly limit the FPGA frequency. Finally, software pipelining following the usual modulo scheduling scheme is also performed, with the same relaxed resource constraints as described for instruction scheduling.
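
As a rough illustration of the resource model just described, here is a toy list scheduler in Java: plain ALU operations are unlimited per cycle, while memory operations compete for a fixed number of ports. The instruction encoding, port count, and example program are invented for this sketch:

    import java.util.*;

    public class ToyScheduler {
        record Op(String name, boolean isMemory, List<Integer> deps) {}

        // Assign each op the earliest cycle allowed by its dependencies;
        // memory ops additionally wait for a free port in that cycle.
        static int[] schedule(List<Op> ops, int memPorts) {
            int[] cycle = new int[ops.size()];
            Map<Integer, Integer> memUsed = new HashMap<>(); // cycle -> ports used
            for (int i = 0; i < ops.size(); i++) {
                Op op = ops.get(i);
                int earliest = 0;
                for (int d : op.deps()) earliest = Math.max(earliest, cycle[d] + 1);
                if (op.isMemory()) {
                    while (memUsed.getOrDefault(earliest, 0) >= memPorts) earliest++;
                    memUsed.merge(earliest, 1, Integer::sum);
                }
                cycle[i] = earliest;
            }
            return cycle;
        }

        public static void main(String[] args) {
            List<Op> ops = List.of(
                new Op("load a", true,  List.of()),
                new Op("load b", true,  List.of()),
                new Op("load c", true,  List.of()),
                new Op("add",    false, List.of(0, 1)),
                new Op("store",  true,  List.of(3)));
            // With 2 memory ports, the third load is pushed to cycle 1.
            System.out.println(Arrays.toString(schedule(ops, 2))); // [0, 0, 1, 1, 2]
        }
    }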

Experimental results demonstrate that these optimizations significantly improve (up to 4x) the performance of the generated code.

Abstract
Obfuscation algorithms are now widely used to prevent software reverse engineering. Binary code virtualization is one of the most powerful obfuscation techniques. Another obfuscation method, known as "dispatching", can be used to transform the application control flow similarly to virtual machine insertion. Our research is aimed at reconstructing the control flow graph in the presence of both code virtualization and dispatching. To achieve this goal, we implemented a deobfuscation tool that keeps track of the virtual program counter used by the virtual machine emulator and reconstructs the application control flow. This paper describes experimental results of test application deobfuscation via dynamic analysis. The obfuscating and deobfuscating tools were independently developed by two different teams at ISP RAS: an LLVM-based obfuscating compiler and a software environment for dynamic analysis of binary code. The paper briefly introduces both tools and then describes the results of experimental research on recovering the control flow graph of an obfuscated application. The application was initially protected by the specialized obfuscating LLVM-based compiler. Then the TrEx environment was used to analyze the program execution trace, to find the dispatcher-protected part of the application, and to recover its control flow. Additionally, some software code complexity metrics for the test applications were calculated to estimate the obfuscation resilience provided by different versions of the obfuscating compiler.
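
A minimal sketch of the recovery step, assuming the virtual program counter (VPC) has already been located: successive VPC values observed in the trace directly yield the edges of the hidden control flow graph. The trace values here are invented:

    import java.util.*;

    public class VpcCfg {
        public static void main(String[] args) {
            int[] vpcTrace = {0, 1, 2, 1, 2, 3};            // VPC value per trace step
            Map<Integer, Set<Integer>> cfg = new TreeMap<>();
            for (int i = 0; i + 1 < vpcTrace.length; i++) { // successive VPCs = an edge
                cfg.computeIfAbsent(vpcTrace[i], k -> new TreeSet<>())
                   .add(vpcTrace[i + 1]);
            }
            cfg.forEach((from, to) -> System.out.println(from + " -> " + to));
            // prints: 0 -> [1], 1 -> [2], 2 -> [1, 3]
        }
    }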
Abstract
Nowadays, protecting intellectual property rights for software is very important because of intense competition and ubiquitous piracy. A powerful technique for such protection is program code obfuscation. This paper describes an approach to developing an obfuscation tool based on a compiler infrastructure. Using a compiler infrastructure is beneficial: during compilation we get complete information about the compiled program for free, and we do not need to solve problems such as correct disassembling, new target support, etc. We have formulated an efficiency criterion for an obfuscating technique: an obfuscation transformation should obfuscate both control flow and data flow simultaneously. The main feature of the described approach is the use of a set of transformations that mask various aspects of the obfuscated application; together, this set provides strong protection against static analysis. Opaque predicate insertion, control-flow flattening, and the increasing-indirection transformation are the most efficient transformations. We have also formulated the requirements a compiler infrastructure must satisfy for an obfuscating compiler to be successfully implemented on top of it. The LLVM (Low Level Virtual Machine) compiler infrastructure meets these essential requirements.
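
For illustration, here is a hand-made before/after example of control-flow flattening, one of the transformations named above, written as Java source for readability (the tool itself transforms LLVM IR):

    public class Flattening {
        // Original: the loop structure is visible in the source.
        static int sumPlain(int n) {
            int s = 0;
            for (int i = 0; i < n; i++) s += i;
            return s;
        }

        // Flattened: every basic block becomes a case of one dispatcher
        // switch, and the successor is chosen by 'state' at run time.
        static int sumFlattened(int n) {
            int s = 0, i = 0, state = 0;
            while (true) {
                switch (state) {
                    case 0: s = 0; i = 0; state = 1; break;  // entry block
                    case 1: state = (i < n) ? 2 : 3; break;  // loop header
                    case 2: s += i; i++; state = 1; break;   // loop body
                    case 3: return s;                        // exit block
                }
            }
        }

        public static void main(String[] args) {
            System.out.println(sumPlain(5) + " == " + sumFlattened(5)); // 10 == 10
        }
    }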
Abstract
The paper describes the obfuscating transformations that were implemented while developing an LLVM-based obfuscating compiler at ISP RAS. The proposed transformations are based on well-known obfuscation algorithms and are specifically improved to better resist static analysis deobfuscation techniques. Estimates of the application performance decrease and of the increase in application memory consumption are presented. The possibility of recovering source code information is also estimated. The implemented obfuscating transformations can be applied together to a given application to provide strong protection from static analysis deobfuscation attacks.
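
As an illustration of the kind of transformation discussed, here is a classic opaque predicate sketched in Java: the guard is always true because x*(x+1) is always even (parity is preserved even under integer overflow), yet a static analyzer that cannot prove this must keep the dead branch; the bogus payload is invented:

    public class OpaquePredicate {
        static int guarded(int x, int v) {
            if ((x * (x + 1)) % 2 == 0) {   // opaquely true predicate
                return v + 1;               // the real computation
            } else {
                return v * 31 - 7;          // dead branch filled with bogus code
            }
        }

        public static void main(String[] args) {
            System.out.println(guarded(41, 1)); // always prints 2
        }
    }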
Abstract

This paper presents a method for raising the level of abstraction of a program execution trace by building an algorithm model.

Dynamic analysis of executable files is often used to understand the logic of a program in the absence of source code. One of the methods of dynamic analysis is the analysis of program execution traces that contain register values and the sequence of instructions executed by the processor. Traces are difficult to understand because of the sheer amount of information they contain.

The first stage of building an algorithm model is the identification of function calls in the trace. Input and output parameters are determined for each call. If the trace contains several calls of the same function, the information about them is combined, yielding the low-level parameters, the return value, and the dependencies between inputs and outputs.

The second stage is variable recovery: low-level data elements are mapped to variables. A processor has a fixed set of registers and limited memory, whereas the number of higher-level variables in a program is unbounded, so variable lifetimes are evaluated to map variables to their locations. A lifetime is the range of trace step indexes from the creation of a variable to its last use. The return value and parameters of a function are recovered using its calling convention.
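
A small sketch of the lifetime computation just described, assuming the trace has been reduced to (step, location) accesses; the record type, locations, and step numbers are invented:

    import java.util.*;

    public class Lifetimes {
        record Access(int step, String location) {}

        // For each storage location, the lifetime is the span of trace step
        // indexes from its first definition to its last use.
        static Map<String, int[]> lifetimes(List<Access> trace) {
            Map<String, int[]> spans = new LinkedHashMap<>(); // loc -> [first, last]
            for (Access a : trace) {
                spans.merge(a.location(), new int[]{a.step(), a.step()},
                    (old, cur) -> new int[]{old[0], cur[1]});
            }
            return spans;
        }

        public static void main(String[] args) {
            List<Access> trace = List.of(
                new Access(0, "eax"), new Access(2, "ebx"),
                new Access(3, "eax"), new Access(7, "ebx"));
            lifetimes(trace).forEach((loc, s) ->
                System.out.println(loc + ": [" + s[0] + ", " + s[1] + "]"));
            // eax: [0, 3]   ebx: [2, 7]; disjoint spans of one location would
            // indicate two distinct high-level variables sharing it
        }
    }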

The third stage is the examination of library calls used in the trace. Symbolic information can be extracted from binary libraries and attached to the corresponding functions in the trace. For some libraries, header files are available; full high-level function prototypes can be extracted from them and mapped onto the trace.

This allows us to obtain high-level parameter types, which are propagated along the trace through global variables and function calls.

Finally, the function models are combined into a high-level model of the algorithm that can be used to restore or analyze it.

Abstract
The paper describes the refactoring techniques used in the Klocwork Insight toolkit for the C/C++ programming languages. Extract Function, the most popular refactoring, is chosen to illustrate all stages of the refactoring process. Different language elements and constructions are processed during extraction of a new function, and most of them are covered by examples. Additionally, it is shown that the Extract Function refactoring has to use data-flow analysis to resolve conflicts arising from different ways of using data on different conditional branches. Furthermore, the refactoring has to change and move variable declarations, so it must resolve type conflicts and maintain the declaration order of the used types when a type is declared inside a function. Some points about syntax tree transformations are also discussed. Such transformations can be automatically mapped to source code changes. Automatically modifying the program's source code based on changes in the syntax tree is commonly known as code transformation and can be used not only for refactoring. The code transformation technique used in our approach can serve as the key idea for a new tool for performing non-trivial transformations of source code.
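
A hand-made before/after sketch of Extract Function (shown in Java only for consistency of the examples; Klocwork Insight targets C/C++). Data-flow analysis decides that the fragment reads 'data', which therefore becomes a parameter, and defines 'sum', which becomes the return value:

    public class ExtractFunction {
        // Before: the summation fragment is tangled into the caller.
        static void reportBefore(int[] data) {
            int sum = 0;
            for (int d : data) sum += d;
            System.out.println("total = " + sum);
        }

        // After: the fragment is extracted; its input became a parameter
        // and its single output became the return value.
        static int sum(int[] data) {
            int sum = 0;
            for (int d : data) sum += d;
            return sum;
        }

        static void reportAfter(int[] data) {
            System.out.println("total = " + sum(data));
        }

        public static void main(String[] args) {
            int[] data = {1, 2, 3};
            reportBefore(data);
            reportAfter(data); // same output, refactored structure
        }
    }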
Abstract
The article describes a way to implement syntax analysis for the C/C++ languages that significantly reduces the required resources. The method is based on the fact that each source file includes many header files, of which only a portion of the definitions is actually used. The analysis of definitions in headers can be postponed until the moment they are actually accessed, thus avoiding the analysis of definitions that are never used. Definitions such as templates, classes, and functions can be skipped during parsing until an access to them is found in the source code. For this purpose, the C/C++ parser was modified to support postponed parsing of skipped parts, along with a revision mechanism such as caching of semantic information for parsed code. As a result of implementing such a parser, source code parsing speedups of up to 36% were observed on several open-source projects. The distinctive feature of this method is that only a small amount of changes is required to add the "lazy" behaviour to an existing parser. This method of improving parser performance is implemented in the Klocwork Insight static analyzer.
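
A minimal sketch of the lazy scheme, assuming the parser records only the source range of a skipped definition and parses it on first access, caching the result; the class, the parse function, and the template bodies are invented:

    import java.util.*;
    import java.util.function.Function;

    public class LazyDefinition<T> {
        private final String sourceRange;        // raw, unparsed text of the body
        private final Function<String, T> parser;
        private T parsed;                        // cache of semantic information

        LazyDefinition(String sourceRange, Function<String, T> parser) {
            this.sourceRange = sourceRange;
            this.parser = parser;
        }

        // Called only when the name is actually referenced in the code.
        T resolve() {
            if (parsed == null) {
                System.out.println("parsing now: " + sourceRange);
                parsed = parser.apply(sourceRange);
            }
            return parsed;
        }

        public static void main(String[] args) {
            Map<String, LazyDefinition<String>> symbols = new HashMap<>();
            symbols.put("min", new LazyDefinition<>("template<...> min(...) {...}",
                                                    body -> "AST(" + body + ")"));
            symbols.put("max", new LazyDefinition<>("template<...> max(...) {...}",
                                                    body -> "AST(" + body + ")"));
            // Only 'min' is referenced, so 'max' is never parsed at all.
            System.out.println(symbols.get("min").resolve());
        }
    }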
Abstract
Big data has challenged traditional storage and analysis systems in several new ways. In this paper we try to figure out how to overcome these challenges, explain why it is not always possible to do so efficiently, and describe three modern approaches to big data handling: NoSQL, MapReduce, and real-time stream processing. The first section of the paper is the introduction. The second section discusses the main issues of Big Data: volume, diversity, velocity, and value. The third section describes different approaches to solving the problem of Big Data. Traditionally, one might use a relational DBMS. The paper proposes some steps that allow the continued use of an RDBMS when its capacity becomes insufficient. Another way is to use a NoSQL approach. The basic ideas of the NoSQL approach are simplification, high throughput, and unlimited scaling out. Different kinds of NoSQL stores make such systems suitable for different Big Data applications. MapReduce and its free implementation Hadoop may be used to provide scale-out Big Data analytics. Finally, several data management products support real-time stream processing over Big Data. The paper briefly overviews these products. The final section of the paper is the conclusion.
Abstract

Nowadays it has become clear that no data store is all-purpose. SQL-based systems have been prevalent for several decades, though this approach leads to serious issues in modern highly scalable architectures. Thus, a number of different data stores have been designed to fulfill the needs of large-scale Web applications. Moreover, a single application often uses multiple data stores for different purposes and data. This paper covers common approaches to implementing data persistence and data store abstraction, and also presents basic considerations for the design of a multi-store persistence layer for scalable Web applications. The paper consists of five sections. The first section (Introduction) discusses the problem of data management in the context of modern approaches to Web application design and development. The second section describes the most popular solutions to this problem based on traditional SQL-based DBMSs; some limitations of these solutions are considered. The third section presents an overview of the advantages and limitations of approaches based on NoSQL DBMSs. A proposed novel approach that combines the use of SQL-based and NoSQL systems to develop and run Web applications is the theme of the fourth section. The final section concludes the paper.
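
A small Java sketch of the persistence-layer abstraction this design implies: application code is written against one repository interface, while the backing store (SQL or NoSQL) becomes a deployment decision. All interface and class names are invented:

    import java.util.*;

    interface UserRepository {
        void save(String id, String profileJson);
        Optional<String> find(String id);
    }

    // In-memory stand-in for a NoSQL key-value store; a JDBC-backed
    // implementation of the same interface would serve the SQL case.
    class KeyValueUserRepository implements UserRepository {
        private final Map<String, String> store = new HashMap<>();
        public void save(String id, String profileJson) { store.put(id, profileJson); }
        public Optional<String> find(String id) { return Optional.ofNullable(store.get(id)); }
    }

    public class MultiStoreDemo {
        public static void main(String[] args) {
            UserRepository users = new KeyValueUserRepository();
            users.save("42", "{\"name\":\"Ann\"}");
            System.out.println(users.find("42").orElse("missing"));
        }
    }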

Abstract

One of the most important ways of increasing the speed of modern databases is to cache frequently used data in RAM. Classical replacement policies are intended to minimize the number of buffer pool faults. This optimization implicitly relies on the assumption that the speeds of reading from and writing to a hard disk are equal. Gradual technology improvement and the falling cost of flash memory have led to solid-state drives (SSDs), which are now increasingly used in personal computers and storage systems. Flash drives have advantages over traditional hard drives; high read and write speeds and significantly smaller random access time are the most important of them. However, the most popular flash memory types read data at a higher speed than they write it. Because of this feature, the classical replacement algorithms for disk data caching are ineffective. This paper reviews recently developed database buffer pool management algorithms designed to work with flash memory drives: CFDC (Clean First – Dirty Clustered), CASA (Cost-Aware Self-Adaptive), SAWC (Self-Adaptive with Write Clustering), and FD-Buffer. Some of these algorithms demonstrate significant advantages over the classical LRU algorithm.
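
To make the read/write asymmetry concrete, here is a toy clean-first policy in the spirit of CFDC (a simplification for illustration, not the published algorithm): on a miss, the least recently used clean page is evicted first, and a dirty page (an expensive flash write-back) is flushed only when no clean victim exists:

    import java.util.*;

    public class CleanFirstPool {
        record Page(int id, boolean dirty) {}
        private final int capacity;
        private final LinkedHashMap<Integer, Page> pool; // LRU order by access

        CleanFirstPool(int capacity) {
            this.capacity = capacity;
            this.pool = new LinkedHashMap<>(capacity, 0.75f, true);
        }

        void access(int id, boolean write) {
            Page p = pool.get(id);
            if (p == null && pool.size() >= capacity) evict();
            pool.put(id, new Page(id, write || (p != null && p.dirty())));
        }

        private void evict() {
            Integer victim = null;
            for (Page p : pool.values())              // iterates in LRU order
                if (!p.dirty()) { victim = p.id(); break; }
            if (victim == null) {                     // all dirty: flush LRU page
                victim = pool.values().iterator().next().id();
                System.out.println("write back page " + victim);
            }
            pool.remove(victim);
        }

        public static void main(String[] args) {
            CleanFirstPool pool = new CleanFirstPool(2);
            pool.access(1, true);   // dirty
            pool.access(2, false);  // clean
            pool.access(3, false);  // evicts clean page 2, keeps dirty page 1
            System.out.println(pool.pool.keySet()); // [1, 3]
        }
    }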

Abstract

The presented overview is concerned with lexical query optimization and covers papers published over the last four decades. The paper consists of five sections. The first section, Introduction, classifies query optimization techniques into semantic optimizations and lexical optimizations. Semantic optimizations usually rely on data integrity rules that are stored in the metadata part of a database, and on data statistics; this kind of optimization is discussed in many textbooks and papers. Lexical optimizations (more often called rewriting) use only the text of a query and no other information about the database and its structure. Lexical optimizations are further classified into query transformation, query amelioration, and query reduction. The second section discusses query transformation techniques such as predicate pushdown, transformation of nested queries into queries with joins, etc. Query amelioration is the topic of the third section, with a focus on magic set optimizations. The fourth section covers query reduction optimizations: traditional approaches (such as tableau-based ones) are briefly described, and three new algorithms proposed by the authors are considered in more detail. The fifth section concludes the paper.
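
A tiny illustration of predicate pushdown from the second section, written with Java streams over invented data: applying the predicate before the join shrinks the join input while producing the same result:

    import java.util.*;

    public class Pushdown {
        record Order(int custId, int amount) {}

        public static void main(String[] args) {
            List<Integer> customers = List.of(1, 2, 3, 4);
            List<Order> orders = List.of(new Order(1, 5), new Order(2, 50),
                                         new Order(3, 70), new Order(9, 90));

            // Naive plan: join first, then apply the predicate amount > 10.
            List<Order> joinThenFilter = orders.stream()
                .filter(o -> customers.contains(o.custId()))
                .filter(o -> o.amount() > 10)
                .toList();

            // Pushed-down plan: the predicate runs before the join,
            // so the join inspects 3 rows instead of 4.
            List<Order> filterThenJoin = orders.stream()
                .filter(o -> o.amount() > 10)
                .filter(o -> customers.contains(o.custId()))
                .toList();

            System.out.println(joinThenFilter.equals(filterThenJoin)); // true
        }
    }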

Abstract
Topic modeling is a method for building a model of a collection of text documents. The model is able to determine the topics of each document. Shifting from the term space to the space of extracted topics helps resolve synonymy and polysemy of terms. Besides, it allows for more efficient topic-sensitive search, classification, summarization, and annotation of document collections and news feeds. The paper shows the evolution of topic modeling techniques. The earliest methods are based on clustering; these algorithms use a similarity function defined on pairs of documents. The next generation of topic modeling techniques is based on Latent Semantic Indexing (LSI), where word co-occurrences in documents are analyzed. Currently, the most popular approaches are based on Bayesian networks, that is, directed probabilistic graphical models that incorporate different kinds of entities and metadata: document authorship; connections between words, topics, documents, and authors; etc. The paper contains a comparative survey of different models along with methods for parameter estimation and accuracy measurement. The following topic models are considered: Probabilistic Latent Semantic Indexing, Latent Dirichlet Allocation, non-parametric models, dynamic models, and semi-supervised models. The paper describes the well-known quality evaluation metrics perplexity and topic coherence. Freely available implementations are listed as well.
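
A sketch of the perplexity metric mentioned above: the exponent of the negative average log-likelihood of held-out words. The probabilities below are invented; a real topic model would supply p(w | document):

    public class Perplexity {
        static double perplexity(double[] wordProbabilities) {
            double logSum = 0;
            for (double p : wordProbabilities) logSum += Math.log(p);
            return Math.exp(-logSum / wordProbabilities.length);
        }

        public static void main(String[] args) {
            // A model assigning uniform probability 1/8 to every word has
            // perplexity exactly 8: it is as "surprised" as a fair 8-way die.
            double[] uniform = {0.125, 0.125, 0.125, 0.125};
            System.out.println(perplexity(uniform)); // 8.0
        }
    }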
Abstract
Based on the UniHUB technology developed by ISP RAS, a Web laboratory (hub) for remote sensing data integration is currently being deployed. A huge amount of remote sensing data is available today. Data are gathered by individual researchers and scientific projects as well as by the complex global databases of major international programs and information centers. Thus we observe a broad migration of geographic data, services, and applications to the network environment, particularly to the Internet. But our information system focuses not only on data collection but mainly on creating a problem-oriented online community of scientists (experts, researchers, students). It provides a search mechanism for sources of spatial data via a graphical interface, data access, web applications, and training materials. In the context of UniHUB, our Web laboratory exists as a user group. Anyone can easily contribute new materials to it as well as take advantage of assistance from competent colleagues. The UniHUB cloud environment is used to solve geographic problems, such as the handling of satellite images and digital elevation models by methods of spatial analysis and geomodeling in the open-source Quantum GIS. Currently our Web laboratory is widely used among students, but we expect it to have many more applications, such as assessments in the field of natural resources, the state of the environment, etc.
Abstract
In this paper we consider a method for extracting the alternative names of a concept or a named entity mentioned in a news cluster. The method is based on the structural organization of news clusters and exploits the comparison of various contexts of words. The word contexts are used as a basis for multiword expression extraction and main entity detection. At the end of cluster processing we obtain groups of near-synonyms, in each of which the main synonym is determined.
Abstract
This paper describes research-in-progress work on a data integration system in the Linked Open Data space. The proposed system uses the concept of RDF identity links to interlink heterogeneous local data sources and integrate them into the global data space.
Abstract
The following techniques are used to analyze massive amounts of data: the MapReduce paradigm, parallel DBMSs, column-wise stores, and various combinations of these approaches. We focus on the MapReduce environment. Unfortunately, join algorithms are not directly supported in MapReduce. The aim of this work is to generalize and compare existing equi-join algorithms together with some optimization techniques.
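
A compact simulation of the repartition (reduce-side) equi-join, the usual baseline among the algorithms being compared: the map phase tags each record with its source relation and emits it under the join key, and the reduce phase crosses the two tagged groups per key. Plain Java collections stand in for the MapReduce runtime, and the data is invented:

    import java.util.*;

    public class RepartitionJoin {
        record Rec(int key, String source, String value) {}

        public static void main(String[] args) {
            // Output of the map phase: records tagged "U" (users) or "O" (orders).
            List<Rec> mapped = List.of(
                new Rec(1, "U", "Ann"),  new Rec(2, "U", "Bob"),
                new Rec(1, "O", "book"), new Rec(1, "O", "ink"),
                new Rec(3, "O", "pen"));

            // Shuffle: group by join key (done by the framework in MapReduce).
            Map<Integer, List<Rec>> byKey = new TreeMap<>();
            for (Rec r : mapped)
                byKey.computeIfAbsent(r.key(), k -> new ArrayList<>()).add(r);

            // Reduce: cross the two tagged groups within each key.
            for (var e : byKey.entrySet()) {
                for (Rec u : e.getValue()) {
                    if (!u.source().equals("U")) continue;
                    for (Rec o : e.getValue()) {
                        if (o.source().equals("O"))
                            System.out.println(e.getKey() + ": " + u.value() + " x " + o.value());
                    }
                }
            }
            // prints: 1: Ann x book, 1: Ann x ink  (keys 2 and 3 have no match)
        }
    }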
Abstract
Temporal data regularity is an important data property that can be used in various applications. Such regularity is explored in the field of periodic pattern mining. In this paper, we raise the problem of periodic set detection and suggest a method for its solution. The existing algorithms for periodic event mining are considered in detail, and a new approach is proposed. The comparison of the algorithms and their performance is demonstrated through a series of experiments.
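
A naive sketch of the underlying notion, not the method proposed in the paper: an event is strictly periodic when all gaps between its occurrences coincide, and the gcd of the gaps gives the candidate period. The timestamps are invented:

    public class PeriodFinder {
        static int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }

        // Returns the period if all occurrence gaps share it, otherwise -1.
        static int period(int[] timestamps) {
            int g = 0;
            for (int i = 1; i < timestamps.length; i++)
                g = gcd(g, timestamps[i] - timestamps[i - 1]);
            for (int i = 1; i < timestamps.length; i++)
                if (timestamps[i] - timestamps[i - 1] != g) return -1;
            return g;
        }

        public static void main(String[] args) {
            System.out.println(period(new int[]{2, 5, 8, 11})); // 3
            System.out.println(period(new int[]{2, 5, 9, 11})); // -1 (not periodic)
        }
    }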
Abstract
The paper discusses the problem of dependency between errors defined by a specification and the related problem of test optimization. There is a dependency between errors if there exists a strict subset of errors such that any nonconforming implementation (i.e., an implementation containing an error) contains an error from this subset; accordingly, it is sufficient for the tests to detect errors only from this subset. The most general formal model of test interaction and the reduction type of conformance are suggested, for which dependency between errors is almost absent. Most of the known conformances in various interaction semantics are demonstrated to be special cases of this general model. In this general model, a dependency between errors may occur when a strict subset of the class of all implementations is chosen as the class of implementations under test. Particular interaction semantics and/or various hypotheses about implementations (specifically, the safety hypothesis) in fact assume that the implementation under test belongs to some subclass of (safe) implementations.
Abstract
The paper proposes a method for the construction of covering arrays, that is, arrays ensuring coverage of all pairs, triples, etc., of configuration parameter values. The method combines a known optimal algorithm for binary pairwise covering array generation with a further greedy construction of the non-binary part of the array. The novelty of the proposed method is that it takes into account parameter usage conditions, that is, constraints on parameter values that must be obeyed to force the operating system under test to use the value of a certain parameter. Usage conditions require changes both in the accounting of covered tuples and in the test construction process, which now includes finding parameter assignments that make all the necessary constraints hold. Construction of such an assignment uses a satisfiability check based on the well-known Aspvall-Plass-Tarjan algorithm. The proposed method is applied to configuration test generation for a real-time operating system with several hundred configuration parameters. The results of this application demonstrate the effectiveness of the method: usually several seconds are enough to generate up to a thousand different configurations (taking into account only analysis and generation, without input and output phases) in which dozens of usage conditions are satisfied.
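
A greedy sketch of pairwise covering array construction in Java; the usage conditions that form the novelty of the paper are deliberately left out, and each row is simply chosen to cover as many yet-uncovered value pairs as possible. The domain sizes and trial count are invented:

    import java.util.*;

    public class PairwiseGreedy {
        // All parameter-value pairs realized by one test row.
        static Set<String> pairs(int[] row) {
            Set<String> ps = new HashSet<>();
            for (int i = 0; i < row.length; i++)
                for (int j = i + 1; j < row.length; j++)
                    ps.add(i + "=" + row[i] + "," + j + "=" + row[j]);
            return ps;
        }

        public static void main(String[] args) {
            int[] domain = {2, 2, 3};                    // three parameter domains
            Set<String> uncovered = new HashSet<>();
            for (int i = 0; i < domain.length; i++)      // enumerate all value pairs
                for (int j = i + 1; j < domain.length; j++)
                    for (int a = 0; a < domain[i]; a++)
                        for (int b = 0; b < domain[j]; b++)
                            uncovered.add(i + "=" + a + "," + j + "=" + b);

            List<int[]> rows = new ArrayList<>();
            Random rnd = new Random(1);
            while (!uncovered.isEmpty()) {               // take the best of 50 random rows
                int[] best = null;
                long bestGain = -1;
                for (int t = 0; t < 50; t++) {
                    int[] row = new int[domain.length];
                    for (int i = 0; i < row.length; i++) row[i] = rnd.nextInt(domain[i]);
                    long gain = pairs(row).stream().filter(uncovered::contains).count();
                    if (gain > bestGain) { bestGain = gain; best = row; }
                }
                uncovered.removeAll(pairs(best));
                rows.add(best);
            }
            rows.forEach(r -> System.out.println(Arrays.toString(r)));
            // about 6-7 rows typically suffice to cover all 16 pairs here
        }
    }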
Abstract
Querying source code helps developers to discover code fragments of interest and to determine their interrelations. Different approaches exist to execute source code queries automatically. Some of them are based on rather simple text pattern matching; more advanced approaches provide the ability to use natural languages for queries and perform matching on the basis of a formal program representation. We suggest using aspect-oriented programming for querying source code and consider the advantages and disadvantages of this approach. The paper introduces the main concepts of aspect-oriented programming and briefly presents C Instrumentation Framework, an aspect-oriented programming implementation for the C programming language. We slightly extended C Instrumentation Framework so that it could execute source code queries written in the form of aspects. The paper gives several examples of the application of the suggested approach, in particular for collecting the information required for generating environment models for device drivers in the Linux Driver Verification project. The suggested approach is inferior to specialized querying tools with respect to queries over control and data flow, but it is simple to implement and allows one to use traditional means of aspect-oriented programming for developing and executing source code queries.
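
The C Instrumentation Framework syntax is not reproduced here; instead, the same aspect-as-query idea is shown in AspectJ, the canonical aspect-oriented extension of Java. The first declaration is a purely static query reported by the weaver at compile time; the method-name pattern is invented:

    public aspect SourceCodeQuery {
        // Static query: flag every call site of a method whose name starts
        // with "register" anywhere in the woven code.
        declare warning : call(* *.register*(..))
            : "query hit: call to a register* method";

        // Dynamic variant: report matches with source locations as they run.
        before() : call(* *.register*(..)) {
            System.out.println("query hit at " + thisJoinPoint.getSourceLocation());
        }
    }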
Abstract
Although TLS and its predecessor SSL have been in use for more than 15 years, there is no accepted public conformance test suite for these protocols. Implementers do have their own test suites, but the primary target of those tests is the internals of the corresponding implementation rather than conformance to the standard. Furthermore, such tests are too implementation-specific to be used for implementations other than the one they were developed for; in general it is impossible to test a TLS/SSL implementation with tests from another implementation. The paper presents an approach to conformance testing of TLS/SSL server implementations based on the Model-Based Testing (MBT) methodology. A test suite was developed and run against a number of open TLS/SSL server implementations. The developed approach to conformance testing is based on automated testing against formal specifications. Requirements in the text of the standard are statements in natural language describing the desired behavior of TLS implementations. Following this approach, we elicited more than 300 functional requirements to server implementations and formalized them in a specification extension of the Java language. The language used is part of the UniTESK family of MBT technologies. Using this extension, the protocol model is presented as a contract specification with pre- and postconditions defining constraints on the allowed behavior. The specification is backed by a technique for on-the-fly test construction provided by the UniTESK tools. We ran the test suite against three open implementations of TLS. Testing revealed discrepancies with the standard in all three implementations, and one implementation suffers from a violation of a critical requirement. The work proved that the proposed approach to verification, based on protocol modeling with contract specifications, provides efficient automation of testing for protocols as complex as security protocols. Furthermore, thanks to the formal models, the tests have well-defined and trackable coverage criteria, which greatly enhances the quality of testing.
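
A plain-Java sketch of the contract-specification style described above (the actual UniTESK specification extension of Java is not reproduced): a protocol step is wrapped with an explicit precondition and postcondition on the allowed behavior. The state and message names are invented:

    public class HandshakeContract {
        private String state = "START";

        // Model of processing a ClientHello on the server side.
        String clientHello() {
            // Precondition: a ClientHello is only allowed at the start.
            if (!state.equals("START"))
                throw new AssertionError("precondition violated: state=" + state);
            state = "HELLO_RECEIVED";
            String reply = "ServerHello";           // the implementation's reply
            // Postcondition: the server must answer with a ServerHello.
            if (!reply.equals("ServerHello"))
                throw new AssertionError("postcondition violated: " + reply);
            return reply;
        }

        public static void main(String[] args) {
            HandshakeContract model = new HandshakeContract();
            System.out.println(model.clientHello());  // ServerHello
            try { model.clientHello(); }               // a second hello is rejected
            catch (AssertionError e) { System.out.println(e.getMessage()); }
        }
    }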
Abstract

Linux driver verification is a large application area for software verification methods, in particular for functional, safety, and security verification. Linux driver software is industrial production code; IT infrastructures rely on its stability, and thus there are strong requirements for its correctness and reliability. Linux driver software is complex, low-level systems code, and its characteristics make it necessary to bring to bear techniques from program analysis, SMT solving, model checking, and other areas of software verification. These areas have recently made significant progress in precision and performance, and the complex task of verifying Linux driver software can succeed if the conceptual state of the art becomes available in tool implementations.

The paper is based on the experience of the research groups led by the authors in verification of industrial software. It is important to develop verification tools that are efficient and effective enough to successfully check software components as complex as device drivers. In this area, verification researchers and the Linux community find mutual benefit in cooperation: for the community it is important to get such crucial software verified, while for the verification community it is important to get realistic verification tasks in order to tune and further develop the technology. The paper provides an overview of the state of the art and points out research directions in which further progress is essential. In particular, the paper considers the most promising verification techniques and tools, including predicate abstraction, counterexample generation, explicit-state verification, and termination and concurrency analysis. One of the main topics of Linux driver verification research is the combination of verification techniques.

Abstract

The paper investigates issues of Linux file system driver testing. Linux file system drivers are implemented as kernel modules, which work in the same address space as the kernel core. For that reason, a driver should be very reliable: it should react adequately to incorrect file system images, to faults in the operating system, etc. A driver should also free all the resources it requests, since the kernel has to work for a long time without restart. Another important feature of file system drivers is that they are programs with multiple entry points, which may be executed concurrently with each other and with other kernel code.

Most existing test systems verify only the driver's behavior in normal situations by issuing system calls, which are eventually dispatched by the kernel into the driver's functions. Some of them also check the driver in concurrent scenarios, but only at a very basic level. Some test systems also verify the driver's behavior in a faulty environment, where a request for memory allocation or a disk read/write may fail. But the fault scenarios used in those systems are probabilistic, which leads to problems with test reproducibility and requires repeating tests many times to achieve better coverage. There is one test system that checks for memory leaks in the driver under test.

The paper concludes with a statement of requirements for more thorough file system driver testing. According to these requirements, a test system has to cover the following aspects:

  1. Normal scenarios at the system call level.
  2. Parallel scenarios with additional checks for data races.
  3. Fault scenarios with insufficient memory and faulty block devices, using techniques such as fault injection.
  4. Handling of incorrect file system images.
  5. Driver testing on system calls with invalid arguments.
  6. Checks for leaks of memory and other resources requested by the driver under test.
Abstract
A new construction for fully homomorphic encryption is suggested that does not use any extra information about the secret key.
Abstract
The problem of scheduling parallel tasks on a group of geographically distributed clusters while optimizing the energy efficiency of computations is considered. Several scheduling algorithms are proposed and experimentally investigated. Methods for reducing the energy consumption of a uniform computer cluster by flexible control of node states (waking nodes up or shutting them down) and of the execution order of waiting tasks are considered. A software system developed at the Institute for System Programming of the Russian Academy of Sciences (ISP RAS) for the dynamic control of nodes in order to reduce energy consumption is described. Several strategies for controlling the states of the nodes are proposed and investigated. Our simulation showed that when the average density of tasks is 0.5, the energy saving is about 10%. When the density of the task flow decreases, the effect of using the proposed system drastically increases: when the average density is 0.3, the saving is 30%; when the average density is 0.2, the saving is 50%; and when the average density is 0.1, the saving is 70%.
Abstract
The classical problem of scheduling a set of tasks so as to optimize load balancing on a set of given processors was first considered in theory in 1966. Graham's algorithm, which sends each task to the least loaded machine (processor), was the first example of an approximation algorithm with guaranteed constant approximation quality. However, all tasks and processors (machines) in Graham's setting were simple: only one processor was needed for each task. On the other hand, Graham's algorithm was an on-line one, that is, each task was sent to a processor without knowledge of future tasks. With the wide use of computational clusters, the problem of scheduling parallel tasks on a set of parallel clusters in the class of on-line algorithms has become very topical. This problem is closely related to a two-dimensional packing problem, the so-called multiple strip packing problem. A few years ago the author proposed an on-line algorithm for this problem and proved that for an arbitrary set of parallel tasks (rectangles) the obtained schedule (packing) is within a factor of 2e of the optimal one. In this paper the author generalizes his on-line algorithm to the case of related parallel machines, when processor speeds differ from cluster to cluster. An on-line algorithm for scheduling parallel tasks on a group of clusters with different processor speeds is presented, and it is proved that its approximation ratio is 2e for any set of parallel tasks. This result is non-trivial because even the case of one-processor related machines has been intensively investigated, and only algorithms with constant approximation ratios have been presented for it.
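
For reference, Graham's on-line rule that the abstract starts from, sketched in Java: each arriving task goes to the currently least loaded machine. Task sizes and the machine count are invented; note that the resulting makespan of 11 exceeds the optimum of 9 ({2,7}, {3,6}, {4,5}), in line with the constant approximation guarantee:

    import java.util.*;

    public class GrahamScheduler {
        public static void main(String[] args) {
            int m = 3;
            double[] load = new double[m];
            double[] tasks = {2, 3, 4, 5, 6, 7};      // arrive one by one (on-line)
            for (double t : tasks) {
                int best = 0;                          // least loaded machine so far
                for (int i = 1; i < m; i++) if (load[i] < load[best]) best = i;
                load[best] += t;
            }
            System.out.println(Arrays.toString(load)); // [7.0, 9.0, 11.0]
        }
    }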
Abstract
To unify a pair of algebraic expressions t1 and t2 is to find a substitution θ such that the terms t1θ and t2θ have the same meaning. The unification problem can be extended to computational programs. To unify a pair of programs π1 and π2 is to build two sequences of assignment statements ρ1 : x1 := t1; x2 := t2; … xn := tn and ρ2 : y1 := s1; y2 := s2; … ym := sm such that both compositions ρ1;π1 and ρ2;π2 are equivalent (i.e., they compute the same function). In this paper we deal with the logic-and-term equivalence introduced in 1972 by V. Itkin. This is the weakest decidable equivalence on programs that approximates the functional equivalence. Based on the polynomial-time equivalence-checking algorithm for logic-and-term equivalence in the first-order model of imperative programs, we introduce a polynomial-time unification algorithm that computes the most general unifier (ρ1, ρ2) for every pair of sequential imperative programs (π1, π2) with respect to logic-and-term equivalence. After discussing the importance of the unification problem for program refactoring and optimization, we briefly recall the basic concepts of term substitution theory and the theory of program schemata. Then we introduce a formal model of the sequential programs we deal with, and finally we present our unification algorithm, prove its correctness, and show that its complexity is polynomial in the size of the programs to be unified.
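
For the term case the abstract starts from, here is a compact Java sketch of classical first-order unification (Robinson's algorithm, without the occurs check), computing a most general substitution θ with t1θ = t2θ; the paper's program-unification algorithm itself is not reproduced. The term encoding and example terms are invented:

    import java.util.*;

    public class Unify {
        // A term is a variable (0-ary, lowercase name here: x, y) or a
        // function application f(t1,...,tn); constants are 0-ary functions.
        record Term(String name, List<Term> args) {
            boolean isVar() {
                return args.isEmpty() && Character.isLowerCase(name.charAt(0));
            }
            public String toString() { return args.isEmpty() ? name : name + args; }
        }

        static Term walk(Term t, Map<String, Term> s) {   // follow existing bindings
            while (t.isVar() && s.containsKey(t.name())) t = s.get(t.name());
            return t;
        }

        static boolean unify(Term a, Term b, Map<String, Term> s) {
            a = walk(a, s); b = walk(b, s);
            if (a.isVar()) { s.put(a.name(), b); return true; }
            if (b.isVar()) { s.put(b.name(), a); return true; }
            if (!a.name().equals(b.name()) || a.args().size() != b.args().size())
                return false;
            for (int i = 0; i < a.args().size(); i++)
                if (!unify(a.args().get(i), b.args().get(i), s)) return false;
            return true;
        }

        public static void main(String[] args) {
            Term x = new Term("x", List.of()), y = new Term("y", List.of());
            Term c = new Term("C", List.of());
            Term t1 = new Term("F", List.of(x, new Term("G", List.of(y))));
            Term t2 = new Term("F", List.of(c, new Term("G", List.of(x))));
            Map<String, Term> theta = new LinkedHashMap<>();
            System.out.println(unify(t1, t2, theta) + " " + theta);
            // true {x=C, y=C}: y unifies with x, which is already bound to C
        }
    }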


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)