
Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Vol 34, No 2 (2022)
7-16
Abstract

One of the key aspects of the correctness of a microprocessor's memory subsystem is its functioning in accordance with the memory coherence protocol. This article presents an approach to test program generation for memory coherence verification of “Elbrus” microprocessors. Requirements for memory coherence tests are considered. A memory map structure is presented that allows flexible description of the memory areas used in tests and the types of accesses to these areas. A method of test program generation based on the memory map structure is described, and a method of automatic memory map generation is proposed. The generated tests have been used for verification of RTL models and FPGA-based prototypes.
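The abstract does not give the memory map's actual format, but the idea of describing test regions and access types in a map and drawing random per-core accesses from it can be sketched as follows (all names, fields, and the address layout are hypothetical, not the article's implementation):

```python
import random
from dataclasses import dataclass

@dataclass
class MemoryRegion:
    """One entry of a hypothetical memory map: an address range plus
    the kinds of accesses the generator may emit for it."""
    base: int
    size: int
    access_types: tuple  # e.g. ("load", "store", "atomic")

def generate_test(memory_map, cores=4, length=8, seed=0):
    """Emit, for each core, a list of (operation, address) pairs drawn
    from the memory map; shared regions create coherence traffic."""
    rng = random.Random(seed)  # fixed seed makes the test reproducible
    program = {c: [] for c in range(cores)}
    for c in range(cores):
        for _ in range(length):
            region = rng.choice(memory_map)
            addr = region.base + rng.randrange(0, region.size, 8)  # 8-byte aligned
            program[c].append((rng.choice(region.access_types), addr))
    return program

memory_map = [
    MemoryRegion(0x1000, 0x100, ("load", "store")),
    MemoryRegion(0x8000, 0x40, ("load", "store", "atomic")),  # contended region
]
test = generate_test(memory_map)
```

Because every core samples from the same small shared regions, the generated programs force conflicting accesses to the same cache lines, which is what exercises the coherence protocol.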

17-24
Abstract

Automated test coverage is now a widespread practice in long-lived software development projects. Under this test development approach, each automated test should reuse functions implemented in a test framework. This research aims to improve the test framework development approach using natural language processing methods. The algorithm includes the following steps: preparation of test scenarios; transformation of scenario paragraphs into a syntax tree using a pretrained OpenIE model; comparison of test steps with test framework interfaces using a GloVe model; transformation of the resulting semantic tree into Kotlin code. The paper describes a prototype system that automatically generates Kotlin tests from natural-language specifications.
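The matching step of such a pipeline can be illustrated with a toy sketch: each framework interface carries a textual description, and each natural-language test step is mapped to the interface whose description is most similar. For self-containment this sketch uses bag-of-words cosine similarity in place of pretrained GloVe embeddings, and the framework interfaces are invented examples:

```python
import math
from collections import Counter

def vectorize(text):
    """Toy stand-in for GloVe: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[k] * b.get(k, 0) for k in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

FRAMEWORK = {  # hypothetical test-framework interfaces and their descriptions
    "login(user, password)": "log in with user name and password",
    "openPage(url)": "open the page at the given url",
    "clickButton(name)": "click the button with the given name",
}

def match_step(step):
    """Pick the framework call whose description is closest to the step."""
    sv = vectorize(step)
    return max(FRAMEWORK, key=lambda f: cosine(sv, vectorize(FRAMEWORK[f])))

def generate_kotlin(steps):
    """Render matched calls as the body of a Kotlin test function."""
    body = "\n".join("    " + match_step(s) for s in steps)
    return "@Test\nfun generatedTest() {\n" + body + "\n}"
```

With real embeddings the similarity becomes robust to paraphrases ("press the button" vs. "click the button"), which is the point of using GloVe rather than word overlap.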

25-42
Abstract

Modern software develops rapidly, revealing new program errors. More and more companies follow a security development lifecycle (SDL). Fuzzing and symbolic execution are among the most popular techniques supporting SDL: they make it possible to test programs and find errors automatically. Hybrid fuzzing, which combines these two techniques, is one of the most effective ways to test programs. Checking security predicates during symbolic execution is an advanced technique that solves extra constraints on input data to find an error and generates an input file to reproduce it. In this paper we propose a method for automatic error detection with dynamic symbolic execution that combines hybrid fuzzing and security predicate checking. First, we run hybrid fuzzing to increase the number of corpus seeds. Then we minimize the corpus so that it gives the same coverage as the original one, and check security predicates on the minimized corpus. Security predicates catch errors such as division by zero, out-of-bounds access, and integer overflow. Their results are then verified with sanitizers to filter out false positives. Applying the proposed method to various open source programs, we found 11 new errors in 5 projects.
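The corpus minimization step preserves coverage while shrinking the set of seeds on which the (expensive) security predicate checks must run. A common way to do this is a greedy set cover over the coverage sets; the sketch below illustrates that idea with made-up seed names and edge identifiers (the paper's actual minimization algorithm is not specified in the abstract):

```python
def minimize_corpus(corpus):
    """Greedy set cover: keep only seeds that contribute new coverage.
    `corpus` maps a seed name to the set of coverage edges it hits."""
    covered, kept = set(), []
    # consider the largest contributors first
    for seed, edges in sorted(corpus.items(), key=lambda kv: -len(kv[1])):
        new = edges - covered
        if new:  # seed adds edges nothing kept so far covers
            kept.append(seed)
            covered |= new
    return kept, covered

corpus = {
    "a": {1, 2, 3},
    "b": {2, 3},   # fully subsumed by "a", so it is dropped
    "c": {3, 4},
}
kept, covered = minimize_corpus(corpus)
```

The minimized set `["a", "c"]` reaches the same four edges as all three seeds, so symbolic execution with security predicates (e.g. asserting that a divisor can be zero, or that an index can exceed its bound) runs on fewer inputs without losing coverage.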

43-56
Abstract

The development and support of knowledge-based systems for experts in the field of social network analysis (SNA) is complicated by the viability-maintenance problems that inevitably emerge in data-intensive domains. Largely this is due to the properties of the semi-structured objects and processes that data specialists analyze using data mining techniques and other automated analytical tools. To be viable, a modern knowledge-based analytical platform should be able to integrate heterogeneous information, present it to users in an understandable way, and provide tools for extending its functionality. In this paper we introduce an ontological approach to information integration and propose design patterns for developing core analytical platform functionality such as ontology repository management, generation of domain-specific languages (DSLs), and round-trip synchronization of source code with DSL models.

57-66
Abstract

This paper investigates the feasibility of an actor-oriented approach to modelling the business processes of analytical systems development. The study analyzes existing management challenges in analytical systems development, identifies key business process modelling approaches, and proposes an actor-oriented modelling approach with high flexibility and enhanced control over business artifacts. The article also describes examples of possible applications of this approach in a business process management tool.

67-76
Abstract

Nowadays, in order for a company to remain competitive, efficient, and attractive to investors, it needs reliable and threat-resistant business processes, and the question of methods for building such processes remains relevant. This paper proposes a software system that combines the methods and tools of domain-specific modeling (DSM), an ontological approach, simulation modeling, queueing (mass service) theory, and Petri nets. As an example, the logistics process of ship boarding in a port is considered. Simulation modeling and DSM are implemented with the AnyLogic and MetaLanguage tools.

77-88
Abstract

The paper proposes various strategies for sampling text data when performing automatic sentence classification for the purpose of detecting missing bibliographic links. We construct samples based on sentences as semantic units of the text and add their immediate context, which consists of several neighbouring sentences. We examine a number of sampling strategies that differ in context size and position. The experiment is carried out on a collection of STEM scientific papers. Including the context of sentences in the samples improves the result of their classification. We automatically determine the optimal sampling strategy for a given text collection by implementing ensemble voting when classifying the same data sampled in different ways. A sampling strategy that takes the sentence context into account, combined with a hard voting procedure, leads to a classification accuracy of 98% (F1-score). This method of detecting missing bibliographic links can be used in the recommendation engines of applied intelligent information systems.
Keywords: text sampling, sampling strategy, citation analysis, bibliographic link prediction, sentence classification.
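The two mechanisms the abstract describes, context-window sampling and hard voting over strategies, can be sketched in a few lines (the sentences, labels, and window sizes below are invented for illustration; the paper's classifier itself is not shown):

```python
from collections import Counter

def sample(sentences, i, before=1, after=1):
    """One sample: sentence i plus its immediate context.
    Different (before, after) pairs give different sampling strategies."""
    return " ".join(sentences[max(0, i - before): i + after + 1])

def hard_vote(predictions):
    """Majority vote over the labels that the per-strategy classifiers
    assigned to the same sentence."""
    return Counter(predictions).most_common(1)[0][0]

sentences = ["A cites B.", "This result extends prior work.", "We conclude."]
samples = [sample(sentences, i) for i in range(len(sentences))]
label = hard_vote(["link", "no-link", "link"])  # three strategies disagree; majority wins
```

Voting across strategies is what lets the method pick the best context size for a collection automatically: no single window has to be tuned by hand, since disagreements are resolved by the majority.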

89-110
Abstract

The work is devoted to improving the quality of neural network image segmentation for documents such as scientific articles and legal acts by training with modified loss functions that take into account the image features of the chosen subject area. Existing loss functions are analyzed, and new functions are developed that operate both on the coordinates of the bounding boxes and on information about the pixels of the input image. To assess quality, a neural network segmentation model is trained with the modified loss functions, and a theoretical assessment is carried out in a simulation experiment showing the convergence rate and segmentation error. As a result of the study, rapidly converging loss functions were created that improve the quality of document image segmentation using additional information about the input data.
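A standard example of a loss that operates on bounding-box coordinates, of the family the abstract refers to, is the IoU loss: one minus the intersection-over-union of the predicted and ground-truth boxes. This sketch shows only that baseline, not the article's modified functions:

```python
def iou_loss(box_a, box_b):
    """1 - IoU for axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # overlap width/height, clamped at zero for disjoint boxes
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return 1.0 - inter / union if union else 1.0

loss = iou_loss((0, 0, 2, 2), (1, 1, 3, 3))  # partial overlap
```

A known weakness of plain IoU loss is its zero gradient for disjoint boxes, which is one motivation for designing modified variants that also use pixel-level information, as the article does.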

111-122
Abstract

This work is devoted to the research and development of a task management system for automated data collection from the Internet. The article describes the implemented methodology and the techniques created for interacting with containers that host data collection applications. In the course of the work, various existing services for automated data collection from the Internet were studied: ready-made open source solutions, cloud services with extensive functionality, and our own solution running on Kubernetes. As a result, a task management system was implemented for the Talisman data analysis platform; it provides horizontal scalability, isolation of the crawler environment, and independence from the technologies used to develop the crawlers.

123-134
Abstract

The work is devoted to the study of automation tools for managing stateful applications, particularly object storage systems, in the Kubernetes environment. A review of existing management tools capable of solving the stated tasks is made. Based on the review, a comparative description of the considered tools is given, and a tool is selected that meets the introduced criteria: popularity, support for Kubernetes, reactivity of the developed operator, additional features, and others. An approach to automatic object storage management using the Operator SDK and Custom Resource Definitions is suggested. As a result of a comprehensive comparative analysis of Kubebuilder, Juju, Metacontroller, KUDO, and the Operator SDK, the last one was chosen as the basis of the implementation. An architecture is proposed for managing a containerized storage system on the Kubernetes platform and for integrating the operator with a user monitoring system. The described approach is implemented in a software tool: an operator for the object storage resource. The paper describes the details of the software implementation, the structure of the storage custom resource descriptor, and methods for testing the resulting system. As a result, an object storage management system based on the Kubernetes platform was created, which made it possible to reduce both the labor costs of supporting and maintaining the system and its cost, by reducing dependence on hardware. Moreover, the described approach supports such features of modern object storages as multi-tiering, erasure coding, geo-replication, and cluster topology awareness, which is quite innovative among existing automated storage management approaches on Kubernetes platforms.

135-144
Abstract

Using Received Signal Strength Indicator (RSSI) values to detect human presence is a well-known Wi-Fi sensing technique. This paper gives an overview of existing algorithms for the problem and proposes two new techniques, based on the discrete Kolmogorov-Wiener filter and on a gated recurrent unit neural network. Results of human detection experiments are presented along with an analysis of the algorithms' accuracy.
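The physical effect these methods exploit is that a moving person perturbs the Wi-Fi channel, so RSSI fluctuates more than in an empty room. A naive variance-threshold baseline (not the paper's filter- or GRU-based detectors; window data and threshold are invented) makes the idea concrete:

```python
from statistics import pvariance

def presence_detected(rssi_window, threshold=4.0):
    """Flag presence when RSSI variance over a sliding window exceeds a
    calibrated threshold. A toy baseline: human motion perturbs the
    multipath environment and inflates RSSI variance."""
    return pvariance(rssi_window) > threshold

empty_room = [-50, -50, -51, -50, -50, -51]       # stable signal, dBm
person_moving = [-50, -58, -46, -61, -49, -55]    # fluctuating signal
```

A Kolmogorov-Wiener filter or a GRU improves on this baseline by modelling the temporal structure of the RSSI series instead of reducing each window to a single variance number.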

145-158
Abstract

For organizations that oversee specially protected natural areas of the Russian Federation, consolidating data on ongoing observations is a relevant task. These data, called the chronicles of nature, were long kept in a simplified paper form and had no clear structure. Automating the business processes of collecting these data and exchanging them between members of the scientific community, as well as building the models needed by the scientific departments of parks and reserves, is therefore important. We consider automating the environmental monitoring process by developing an electronic document management module based on the Directum RX integration platform. The purpose of the research is to automate the eco-monitoring process on the Directum RX platform, which makes it possible to build a corporate content management system as well as a full-fledged storage and retrieval system for observation data. The article describes the role model for working with the system, the system architecture, and the developed components of the "Ecomonitoring" module based on the Directum RX platform. A structural method was used, dividing the task into many independent, understandable, hierarchically ordered stages. Integration with the Yandex weather service has been developed for further use in analytical models, and a solution has been developed to manage a universal classifier of animals in accordance with the internationally accepted biological taxonomy. The automation of user actions for collecting and processing observation data is demonstrated. As a result, the environmental monitoring process was automated in one of the reserves of the Russian Federation.

159-178
Abstract

This paper describes the Slurm-VNIITF software developed by the Federal State Unitary Enterprise “Russian Federal Nuclear Center - Zababakhin All-Russian Research Institute of Technical Physics”, its architecture, and its resource and task management capabilities for HPC systems used in numerical simulation. Many years of operating such HPC systems have shown that the basic features of Slurm (Simple Linux Utility for Resource Management) are clearly insufficient for the effective use of computing resources in HPC centers. The authors therefore propose an improved task and resource management policy; the Slurm extension modules (plugins) implementing this policy are also described.

179-190
Abstract

The paper is devoted to the issue of blood donation and possible ways to promote this activity using modern information technologies. Existing software solutions are analyzed, and a new web application is proposed that implements the features potential blood donors need, making the donation process clearer and more comfortable.

191-200
Abstract

A DNA sequence can be represented in various ways. The variation graph is one of the most accurate representations: it allows working with atypical regions and taking all their diversity into account. Based on this data structure and the polygenic risk assessment method, a DNA interpretation system was built. As a result, a correlation coefficient was obtained between the path in the graph corresponding to a specific DNA sequence and the trait, and compared with the coefficient obtained by a similar method using a reference genome representation of the sequence. This comparison helped to evaluate the effectiveness of the graph representation. A modified method for calculating the polygenic score on alignment data from the vg tool was then built and compared with existing methods; the modified method improved the prediction of the trait.
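Whatever the alignment representation, the polygenic score itself is a weighted sum over variants: effect size times risk-allele count. This minimal sketch shows the base computation on which such methods build (the variant IDs, weights, and genotype are invented; real effect sizes come from GWAS summary statistics):

```python
def polygenic_score(genotype, weights):
    """PRS = sum over variants of effect size x risk-allele count.
    `genotype` maps a variant id to an allele count (0, 1 or 2);
    `weights` maps a variant id to its effect size."""
    return sum(weights[v] * genotype.get(v, 0) for v in weights)

weights = {"rs1": 0.12, "rs2": -0.05, "rs3": 0.30}   # hypothetical effect sizes
genotype = {"rs1": 2, "rs2": 1, "rs3": 0}            # allele counts for one sample
score = polygenic_score(genotype, weights)
```

The graph-based method changes how allele counts are determined, by reading them off the path in the variation graph rather than off a reference-genome alignment, while the scoring formula stays the same.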

201-208
Abstract

With the development of modern technologies in medical organizations, there is an opportunity to modernize existing methods of monitoring public health and detecting diseases. Telemedicine can reduce costs and increase the efficiency and accessibility of medical services, including health monitoring through remote (outside medical and preventive institutions) registration and processing of ECGs, which helps detect diseases at an early stage. In this paper, we propose an approach to displaying data to users of telemedicine systems for independent (without medical staff) early detection of diseases from an ECG. This approach can be used in the development of graphical interfaces for telemedicine systems aimed at the early detection of diseases from ECGs.



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)