
Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Vol 34, No 5 (2022)
7-22
Abstract

Application-specific systems with FPGA accelerators are often designed using high-level synthesis or hardware construction tools. Nowadays, there are many frameworks available, both open-source and commercial. In this work, we attempt to fairly compare several existing solutions (languages and tools), including Verilog (our baseline), Chisel, Bluespec SystemVerilog (Bluespec Compiler), DSLX (XLS), MaxJ (MaxCompiler), and C (Bambu and Vivado HLS). Our analysis has been carried out using a representative example: the 8×8 inverse discrete cosine transform (IDCT), a widely used algorithm employed, among others, in JPEG and MPEG decoders. The metrics under consideration include: (a) the degree of automation (how much less code is required compared to Verilog), (b) the controllability (the ability to achieve given design characteristics, namely a given performance-to-area ratio), and (c) the flexibility (the ease of modifying the design to achieve certain characteristics). Rather than focusing on computational kernels only, we have developed AXI-Stream wrappers for the synthesized implementations, which makes it possible to adequately evaluate the characteristics of the designs when they are used as parts of real computer systems. Our study shows clear examples of the impact that specific optimizations (tool settings and source code modifications) have on the overall system performance and area. It emphasizes how important it is to be able to control the balance between communication interface utilization and computational kernel performance, and delivers clear guidelines for the next generation of tools for designing FPGA-accelerator-based systems.
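
The benchmark kernel itself is fully standard. For reference, here is a minimal NumPy sketch of the textbook 8×8 IDCT formula, cross-checked against SciPy's orthonormal transform; the hardware implementations compared in the paper are, of course, fixed-point and pipelined, so this floating-point version (with a function name of our own choosing) is illustrative only.

```python
import numpy as np
from scipy.fft import idctn  # used only for the cross-check below

def idct_2d_8x8(F):
    """Naive 8x8 inverse DCT: the JPEG/MPEG reference formula."""
    C = np.ones(8)
    C[0] = 1 / np.sqrt(2)
    f = np.zeros((8, 8))
    for x in range(8):
        for y in range(8):
            s = 0.0
            for u in range(8):
                for v in range(8):
                    s += (C[u] * C[v] * F[u, v]
                          * np.cos((2 * x + 1) * u * np.pi / 16)
                          * np.cos((2 * y + 1) * v * np.pi / 16))
            f[x, y] = s / 4
    return f

# The naive formula coincides with SciPy's orthonormal 2D IDCT.
F = np.random.randn(8, 8)
assert np.allclose(idct_2d_8x8(F), idctn(F, norm='ortho'))
```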

23-42
Abstract

This article discusses the experience and evaluates the feasibility of using free, open, and in-house proprietary EDA tools in a billion-gate SoC verification flow initially based on commercial EDA tools from the "Big 3". The article suggests an approach to assessing the suitability of a particular EDA tool for a certain stage of the verification flow, based on a formal description of the stage and the requirements for the tool. It presents the internal tools implemented in our company, which are used either as alternatives to commercial tools or as unique solutions. Based on the proposed approach, the existing automation tools were analyzed from the point of view of their applicability in the SoC verification flow.

43-62
Abstract

Security analysis of network programs includes a set of network protocol reverse engineering tasks, among them recovering the data formats and the implemented protocol automaton. Unlike the well-studied problem of format recovery, on which there are many papers, recovering the protocol automaton from a program implementation remains terra incognita, and the cornerstone issue is the currently undefined notion of a protocol state. There are two general ways to retrieve the implemented protocol automaton: analyzing network traces or examining the binary trace of the target application. This article follows the second one. The first aim of the paper is to describe a mathematical model of a protocol automaton and a method for projecting it onto the binary code of the executing application. The second is to present the concept of a protocol state and a principle for detecting state transitions based on certain "global" binary trace objects. Thirdly, a way of refining the protocol automaton by in-memory fuzzing based on a "floating" fork-server that manages state transitions is suggested. Finally, the scheme of the developed toolset and experiments on applying it to a real VPN client are shown.
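
The abstract does not spell out the formal model, so the following Python sketch is only a schematic illustration of the general idea: treating the protocol automaton as a labeled transition system whose states are identified by signatures of "global" objects observed in the binary trace, and recording a transition whenever the observed signature changes. The trace format and all names here are assumptions made for illustration, not the paper's actual model.

```python
from collections import defaultdict

def infer_automaton(trace):
    """trace: iterable of (state_signature, message_type) pairs in
    the order they were observed in the binary trace (assumed format).
    A transition is recorded whenever the state signature changes."""
    transitions = defaultdict(set)
    prev_state, prev_msg = None, None
    for state, msg in trace:
        if prev_state is not None and state != prev_state:
            transitions[(prev_state, prev_msg)].add(state)
        prev_state, prev_msg = state, msg
    return dict(transitions)

# Invented example trace of a handshake-like exchange.
trace = [('INIT', 'HELLO'), ('HANDSHAKE', 'KEYEX'),
         ('HANDSHAKE', 'KEYEX'), ('ESTABLISHED', 'DATA')]
print(infer_automaton(trace))
# {('INIT', 'HELLO'): {'HANDSHAKE'}, ('HANDSHAKE', 'KEYEX'): {'ESTABLISHED'}}
```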

63-76
Abstract

The problem of constructing an algorithm for comparing two executable files is considered. The algorithm is based on the construction of a similarity feature vector for a given pair of programs. This vector is then used to decide on the similarity or dissimilarity of the programs using machine learning methods. Similarity features are built using algorithms of two types: universal and specialized. Universal algorithms do not take the format of the input data into account (fuzzy hash values, compression ratios). Specialized algorithms work with executable files and analyze machine code (using disassemblers). A total of 15 features were built: 9 features of the first type and 6 of the second. Based on a constructed training set of similar and dissimilar program pairs, 7 different binary classifiers were trained and tested. The coreutils programs were used to build the training set. The results of the experiments showed high accuracy for models based on random forest and k-nearest neighbors. It was also found that the combined use of features of both types can improve classification accuracy.
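
As a sketch of how one "universal" (format-agnostic) feature can feed a trained classifier, consider the normalized compression distance built on zlib compression ratios, one plausible instance of the compression-based features the abstract mentions. The paper's exact 15 features are not reproduced; the helper names below are illustrative.

```python
import zlib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def ncd(a: bytes, b: bytes) -> float:
    """Normalized compression distance: a 'universal' similarity
    feature that ignores the input data format."""
    ca, cb = len(zlib.compress(a)), len(zlib.compress(b))
    cab = len(zlib.compress(a + b))
    return (cab - min(ca, cb)) / max(ca, cb)

def features(a: bytes, b: bytes):
    # One universal feature; the paper combines 9 universal and
    # 6 disassembly-based features (not reproduced here).
    return [ncd(a, b)]

def train(pairs, labels):
    """pairs: list of (bytes, bytes); labels: 1 similar, 0 dissimilar."""
    X = np.array([features(a, b) for a, b in pairs])
    return RandomForestClassifier(n_estimators=100).fit(X, labels)
```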

77-88
Abstract

This work is devoted to the development of a library designed for implementing compilers. The article describes the library's features and the main aspects of its operation. In the course of the work, parser generation based on LR(1) automata was studied and implemented, and two auxiliary languages were designed and implemented: a semantic network query language and a language for generating executable code. The result of the work is a library for the .NET platform (tested with the C# language) that contains classes simplifying the implementation of source code parsing, semantic analysis, and executable file generation. The library has no external dependencies apart from the standard .NET library.
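
The library itself targets .NET/C#, but the core of any LR(1)-based generator is the same table-driven parse loop. As a language-neutral illustration (here in Python, with hand-built tables for a toy grammar rather than anything generated by the library), a minimal shift-reduce driver looks like this:

```python
# Toy grammar:  E -> E + n | n.  Tables built by hand; an LR(1)
# generator like the one described derives them automatically.
RULES = [('E', 3), ('E', 1)]           # rule i: (lhs, len(rhs))
ACTION = {                             # (state, lookahead) -> action
    (0, 'n'): ('s', 2),
    (1, '+'): ('s', 3), (1, '$'): ('acc',),
    (2, '+'): ('r', 1), (2, '$'): ('r', 1),
    (3, 'n'): ('s', 4),
    (4, '+'): ('r', 0), (4, '$'): ('r', 0),
}
GOTO = {(0, 'E'): 1}

def parse(tokens):
    stack, toks = [0], list(tokens) + ['$']
    while True:
        act = ACTION.get((stack[-1], toks[0]))
        if act is None:
            raise SyntaxError(f'unexpected {toks[0]!r}')
        if act[0] == 'acc':                    # accept
            return True
        if act[0] == 's':                      # shift
            stack.append(act[1])
            toks.pop(0)
        else:                                  # reduce by RULES[act[1]]
            lhs, n = RULES[act[1]]
            del stack[-n:]
            stack.append(GOTO[(stack[-1], lhs)])

parse(['n', '+', 'n', '+', 'n'])   # accepts
```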

89-110
Abstract

Natch is a tool that provides a convenient way of obtaining an attack surface. By attack surface we mean the list of executable files, dynamic libraries, and functions that are responsible for processing input data (such as files or network packets) during task execution. Functions that end up in the attack surface are possible sources of software vulnerabilities, so they should receive increased attention during analysis. At the heart of the Natch tool lie improved methods of tainted data tracking and virtual machine introspection. Natch is built on top of the full-system QEMU emulator, so it allows analyzing any system components, including even the OS kernel and system drivers. The collected attack surface data is visualized by SNatch, a data post-processing and GUI tool that ships with Natch by default. Attack surface extraction can be built into CI/CD pipelines for integration and system testing. A refined attack surface increases the effectiveness of functional testing and fuzzing in the secure software development life cycle.

111-126
Abstract

The paper discusses an approach to the comparison of intrusion detection systems (IDS) based on several independent scenarios and comprehensive testing. This approach made it possible to identify the advantages and disadvantages of IDS based on machine learning methods (ML IDS), to identify the conditions under which an ML IDS can outperform signature-based systems in terms of detection quality, and to assess the practical applicability of ML IDS. The developed scenarios made it possible to model the realization of both known attacks and a zero-day exploit. The conclusion is drawn that ML IDS has an advantage in detecting previously unknown attacks and that it is feasible to build hybrid detection systems combining the potential of signature-based and heuristic methods of analysis.

127-142
Abstract

With the development of online social networks, the task of identifying users who have a strong influence on other participants in social networks is becoming increasingly important. An important source of information is user comments on content created by other users. The paper proposes a method for determining influence based on a bipartite user-comment-content graph. It incorporates information about text messages and the reactions of other users to them. In addition, we propose a method for identifying user communities in such a graph based on common interests. Experiments on data collections from the VKontakte and YouTube networks show a correlation between user activity and influence; however, the most active commenters are not necessarily the most influential. Community analysis shows a positive correlation between the size of a community, the number of most influential users in it, and the average influence of the community's users.
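
As a hedged sketch of such a pipeline (not the paper's exact model), the bipartite graph can be projected onto its user side, users ranked with a standard centrality measure, and communities extracted from the same projection. Here networkx's generic routines stand in for the paper's own influence measure and community method, and the toy edge weights are invented.

```python
import networkx as nx
from networkx.algorithms import bipartite
from networkx.algorithms.community import louvain_communities

# Toy bipartite graph: users vs. content items; an edge weight stands
# for the reactions a user's comment on that item received (invented).
G = nx.Graph()
G.add_weighted_edges_from([('u1', 'post1', 5), ('u1', 'post2', 2),
                           ('u2', 'post1', 1), ('u3', 'post2', 8)])
users = {'u1', 'u2', 'u3'}

# Project onto users (linked if they commented on the same content),
# then use PageRank as a stand-in influence score.
P = bipartite.weighted_projected_graph(G, users)
influence = nx.pagerank(P, weight='weight')
print(sorted(influence.items(), key=lambda kv: -kv[1]))

# Communities of users with common interests, via Louvain.
print(louvain_communities(P, weight='weight', seed=0))
```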

143-162
Abstract

This paper overviews and compares various tools for automating resource management in the cloud. Changes in software architecture and development approaches require automating deployment management processes and the further maintenance of software in different environments. Section 2 provides a detailed overview of the tools with sample configurations, as well as a review of relevant articles that examine various automation tools and the effectiveness of their adoption. Section 3 presents a draft solution, developed at ISP RAS, for combining orchestrators to obtain a tool with functionality that competitors do not have.

163-170
Abstract

The paper presents the results of a corpus-based study of noun cryptotypes in 20 varieties of English (Englishes). The data for this research, collected from Mark Davies' corpora GloWbE and NOW, enabled us to focus on variation in the covert classification of nouns in modern Englishes. A noun cryptotype, a notion introduced by Whorf, is approached as 'a covert type of classification of nouns, marked by lexical selection in a syntactical classifier rather than a morphological tag'. The purpose of the study has been to compare and contrast the covert classification of 23 basic emotions in 20 Englishes (64,702 tokens). The 20 Englishes have been clustered with the help of Data Mining methods (k-means clustering and a self-organizing Kohonen map). Six clusters emerged, corresponding to geographic areas: an American cluster (American and Canadian Englishes); an Australian cluster (Australian and New Zealand Englishes); a European cluster (British and Irish Englishes); an Asian cluster (Indian, Pakistani, Singapore, Hong Kong, Malaysian, Bangladeshi, Sri Lankan, and Philippine Englishes); an African cluster (Kenyan, South African, Nigerian, Ghanaian, and Tanzanian Englishes); and a Caribbean cluster (Jamaican English). The correlation coefficients among Englishes in the Asian and African clusters (the Outer Circle in Braj B. Kachru's World Englishes paradigm) range from 0.74 to 0.8 due to little contact among the varieties inside these clusters. The correlation coefficients between Englishes in the American, Australian, and European clusters (the Inner Circle, Kachru) range from 0.92 to 0.933, which indicates a high consistency of these varieties owing to long-standing, enduring linguistic contacts.
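
The clustering step itself is standard. Below is a minimal scikit-learn sketch of k-means over per-variety feature vectors; the data here is a random placeholder, whereas the study used counts of syntactic classifiers for 23 emotion nouns over 64,702 tokens (the Kohonen-map stage is omitted).

```python
import numpy as np
from sklearn.cluster import KMeans

# One row per variety of English: a vector of (normalized) classifier
# frequencies for the 23 emotion nouns. Placeholder random data here.
rng = np.random.default_rng(0)
X = rng.random((20, 23))
varieties = [f'variety_{i}' for i in range(20)]

km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(X)
for label in range(6):
    members = [v for v, l in zip(varieties, km.labels_) if l == label]
    print(label, members)
```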

171-182
Abstract

The article describes a new method for the contextual resolution of homonymy based on a centroid-context model (CCM). The proposed method of detecting cases of homonymy in a text corpus and resolving them using the CCM is based on the theoretical concept of phraseological conceptual analysis of texts (FCAT) and a unique machine grammar built on a system of inflective classes of Russian words. The rigid correspondence between the representation form of words and their grammatical information, laid down in the theoretical concept of inflective classes of Russian words, made it possible to create new classes on this basis: classes of words that have the same sets of grammatical features corresponding to their representation forms in similar contextual environments. When developing this model, the authors proceeded from the following hypothesis: the same sequences of generalized word-class symbols (generalized syntagms) should correspond to the same syntactic structures in different text fragments. At the same time, it was assumed that this hypothesis holds for any syntactic models and can be useful in solving both global and particular problems of text analysis. Using this method, a new solution to the problem of homonymy resolution based on the proposed CCM was obtained.

183-194
Abstract

Timed-arc Petri nets are a time extension of Petri nets that allows assigning clocks to tokens. A system of dynamic points on a metric graph (DP-system) is another dynamical model, studied in discrete geometry dynamics; a DP-system combines continuous time with discrete branching events and is used, for example, in the study of the scattering of localized Gaussian wave packets on thin structures. In recent works, asymptotic estimates of the growth of the number of points in dynamic systems on metric graphs were obtained. In this paper, we provide a means to overapproximate the number of distinct timer values for a subclass of timed-arc Petri nets by constructing a system of dynamic points on a metric graph, and we prove that the number of timer values is overapproximated by the number of points in the DP-system.

195-204
Abstract

The issues of mathematical modeling of turbulent flows with blowing in channels of different cross-sectional shapes are considered. In a series of computational experiments using the OpenFOAM tools, the influence of the channel shape on the resulting flow features is investigated. A mathematical model of conjugate heat transfer for the channels under consideration is proposed. Comparison of the numerical simulation results with known experimental data has confirmed the correctness of the proposed models, schemes, and algorithms. The topological features of the structure of the viscous compressible heat-conducting gas flow in channels of complex shape with mass injection have been studied, including the velocity profiles formed in the outlet sections of the channels, and the results of calculating the velocity non-uniformity coefficient are described.

205-214
Abstract

In this paper, a numerical simulation of forced convective heat transfer in a silicon microchannel heat sink has been performed using the OpenFOAM tools. A single-phase fluid, water, was used as the heat transfer medium. The microchannel heat sink is modeled as a silicon substrate 10 mm long, with rectangular microchannels 57 microns wide and 180 microns deep running along the full length of the heat sink. A comparative analysis, in the form of cross-platform verification of the obtained numerical results against data from third-party authors, was performed. The analysis has shown good agreement of the results and the possibility of using the OpenFOAM package as a computational environment for numerically simulating the physical processes occurring in microchannel heat sinks.

215-226
Abstract

The article deals with the problem of modeling the icing of the delta wing of the X-59 demonstrator aircraft. Three grid variants, with 1.1, 3.8, and 9.6 million cells, are considered. Wing icing is modeled using the iceFoam solver developed on the basis of the OpenFOAM package. To solve the problem, two grids are used: the first in the outer region, the second for the liquid film near the solid surface. The blockMesh and snappyHexMesh utilities from the OpenFOAM package are used to build the gas phase grid. The quality of the gas phase grid, as determined by the standard checkMesh utility, meets all the requirements being checked. However, during the automatic construction of the liquid film mesh, cells with unsatisfactory parameters may be formed, for example, ones violating the requirement of limited face non-orthogonality. In this regard, a new algorithm for excluding low-quality computational cells is discussed. The simulations were carried out for X-59 demonstrator wing models at scales of 1:25 and 1:1 for the case of loose ice. For the different-scale models, the invariance of the Reynolds and Mach numbers was ensured; at the same time, the parameters kept constant in dimensional form were the water content of the air and the median diameter of the water droplets. Patterns of ice formation on the upper and lower parts of the wing were obtained. It is shown that the icing regions of different-scale wing models can differ significantly even when the dimensionless similarity complexes for the gas phase coincide. It is concluded that many experimental and computational results on the icing of small profiles are difficult to transfer to full-scale profiles. The simulation was performed on the ISP RAS cluster using 48 or 96 processor cores.
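
The cell-exclusion algorithm itself is not given in the abstract, but the quality metric it relies on is conventional: face non-orthogonality is the angle between the face normal and the line connecting the centres of the two adjacent cells, the quantity checkMesh reports. A minimal NumPy sketch of this measurement (the function name is ours, for illustration):

```python
import numpy as np

def nonorthogonality_deg(face_normal, owner_centre, neighbour_centre):
    """Angle (degrees) between the face area vector and the vector
    joining adjacent cell centres; checkMesh flags faces where this
    angle is large (severe non-orthogonality) as low quality."""
    d = np.asarray(neighbour_centre, float) - np.asarray(owner_centre, float)
    n = np.asarray(face_normal, float)
    cos_ang = abs(n @ d) / (np.linalg.norm(n) * np.linalg.norm(d))
    return np.degrees(np.arccos(np.clip(cos_ang, -1.0, 1.0)))

# A perfectly orthogonal face: normal parallel to the centre-to-centre line.
print(nonorthogonality_deg([1, 0, 0], [0, 0, 0], [2, 0, 0]))  # 0.0
```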

227-242
Abstract

A three-dimensional mathematical model of coastal water dynamics in Peter the Great Bay of the Sea of Japan is presented. The system of Navier-Stokes equations is written in the hydrodynamic approximation of an inhomogeneous fluid and takes into account turbulent exchange, wind stresses, friction forces, and the complex topography of the bottom and coastline. On the basis of hydrographic information, the bottom topography and coastline are reconstructed numerically. The free water surface equation is replaced by a singularly perturbed problem. Code for calculating the hydrodynamic characteristics of coastal waters was written in the FreeFem++ application software package. During the solution, the finite element mesh is built automatically, with the mesh parameters set by the user. To solve the singularly perturbed problem, at each time layer the mesh is adapted around the numerical solution using the given mesh adaptation parameters.

243-250
Abstract

The paper proposes an algorithm based on central differences and flux-corrected transport (FCT) for solving the shallow water problem. The results of numerical testing were compared with known data. The comparison showed that the proposed algorithm has accuracy similar to that of other methods. A comparison of the speed of the proposed algorithm with that of a similar one based on the MacCormack method is given. The conclusion is drawn that it is superior in speed to the MacCormack method at the same accuracy.
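
The abstract names the scheme without details, so the sketch below only illustrates the flux-corrected transport idea itself: the classic Boris-Book limiter applied to 1D linear advection. The paper's algorithm applies central differences with FCT to the full shallow water system, which this illustration does not reproduce.

```python
import numpy as np

def fct_advect(u, c, steps):
    """1D linear advection u_t + a u_x = 0 with the Boris-Book FCT
    limiter (c = a*dt/dx, 0 < c < 1, periodic boundaries)."""
    for _ in range(steps):
        # 1) low-order (upwind) transported-diffused solution
        utd = u - c * (u - np.roll(u, 1))
        # 2) raw antidiffusive flux at i+1/2: high-order (Lax-Wendroff)
        #    flux minus low-order flux
        A = 0.5 * c * (1 - c) * (np.roll(u, -1) - u)
        # 3) Boris-Book limiter: do not create new extrema in utd
        d = np.roll(utd, -1) - utd          # utd_{i+1} - utd_i
        s = np.sign(A)
        Ac = s * np.maximum(0.0, np.minimum.reduce(
            [np.abs(A), s * np.roll(d, -1), s * np.roll(d, 1)]))
        # 4) antidiffusion step
        u = utd - (Ac - np.roll(Ac, 1))
    return u

x = np.linspace(0, 1, 200, endpoint=False)
u0 = np.where((x > 0.4) & (x < 0.6), 1.0, 0.0)   # square pulse
u = fct_advect(u0, c=0.5, steps=100)              # advects without ringing
```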



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)