
Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Vol 37, No 3 (2025)
View or download the full issue PDF (Russian)
9-18
Abstract

This paper considers the problem of congestion map prediction at the pre-routing stage of VLSI layout design of digital blocks by applying neural network models. Early prediction of congestion allows the VLSI design engineer to modify the floorplan, macro placement, and input-output port placement to prevent interconnect routing issues at later stages, thereby reducing the number of EDA tool runs and the overall circuit design runtime. In this work, we propose using initial layout parameters that were not considered in previous works and that allow for more accurate congestion prediction.

19-38
Abstract

A method for synthesizing self-checking digital devices with improved testability indicators is described. The method is based on concurrent error-detection circuit synthesis using Boolean correction of signals and the Hamming (7, 4) code, with computations monitored according to two diagnostic criteria. The attributes used are the membership of code words in the (7, 4) code and the self-duality of each function describing the data and check bits of the code. The basic structure of the concurrent error-detection circuit for a seven-output combinational device is given. The structure uses standard blocks, apart from the Boolean correction function calculation block. An algorithm has been developed for synthesizing the Boolean correction function calculation block that meets the conditions for ensuring the self-duality of the generated signals and the membership of code words in the (7, 4) code. The application features of the basic structure are studied. It is shown that as n increases, the technical implementation complexity of individual standard components of the concurrent error-detection circuit decreases in comparison with the traditional duplication method. However, due to the growing complexity of the comparator, the overall complexity of the technical implementation increases. This slows the growth of the “efficiency margin for structural redundancy” of the proposed method as n increases. Thus, the proposed method can outperform duplication only when the complexity of individual Boolean correction function calculation blocks is significantly reduced (considering the possibilities for joint optimization of their structures). A preliminary assessment allows us to recommend the developed method for special cases of diagnostic objects with a small number of outputs (no more than 30); effectiveness versus duplication must be assessed on a case-by-case basis.
In terms of testability, the method turns out to be more advantageous than duplication, since it makes it easier to form tests for the elements of the concurrent error-detection circuit, and it makes their formation possible even in cases where this is impossible with duplication. The proposed method for synthesizing self-checking devices can be considered when designing highly reliable digital systems on a modern element base.
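The synthesis above rests on the Hamming (7, 4) code. As a minimal, illustrative sketch (this toy example is not the paper's correction-block synthesis algorithm), encoding four data bits and checking membership in the code via the syndrome can be written as:

```python
def encode_7_4(d):
    """Encode 4 data bits into a Hamming (7, 4) codeword.
    Layout: positions 1..7 = [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def syndrome(c):
    """Return the error position (1..7), or 0 if c is a valid codeword."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # checks positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # checks positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # checks positions 4, 5, 6, 7
    return s1 + 2 * s2 + 4 * s3

cw = encode_7_4([1, 0, 1, 1])
assert syndrome(cw) == 0    # membership in the (7, 4) code
cw[4] ^= 1                  # single-bit fault at position 5
assert syndrome(cw) == 5    # the syndrome points at the faulty bit
```

A concurrent error-detection circuit exploits exactly this property: any single-bit distortion of a checked seven-bit word yields a nonzero syndrome.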

39-58
Abstract

The authors consider the PIR (Private Information Retrieval) problem of ensuring secure requests to a database hosted in the cloud in the presence of an active adversary who does not interfere with the PIR protocol but can carry out an attack with known open queries. To represent the bit number i as a number all of whose digits are different, the proposed algorithms use a base-l number system with d digits. Permutations of the digits of the requested bit number are used as secret encryption keys. To reduce the communication complexity, the bits of the source database stored in the cloud are grouped into arrays. A pseudorandom number generator is used to replace the bit value depending on the number i requested by the client. This makes it difficult to match the bit value to a specific number in the case of collusion between a passive adversary located in the cloud and an active adversary outside the cloud. The communication complexity and the probability of guessing the bit number are estimated both for a single attack with a known open query for bit number i and for an attack with an unlimited number of known open queries.
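The digit-permutation idea can be sketched as follows. This is a simplified illustration under assumed parameters (l = 10, d = 4), omitting the distinct-digit constraint, the array grouping, and the bit-value substitution that the actual protocol uses:

```python
import random

def to_digits(i, base, width):
    """Represent index i in a base-`base` system with `width` digits (least significant first)."""
    digits = []
    for _ in range(width):
        digits.append(i % base)
        i //= base
    return digits

def from_digits(digits, base):
    return sum(d * base ** k for k, d in enumerate(digits))

# Secret key: a permutation of digit positions (illustrative only).
l, d = 10, 4
key = list(range(d))
random.Random(42).shuffle(key)

def mask_index(i):
    """Permute the digits of the requested bit number with the secret key."""
    digits = to_digits(i, l, d)
    return [digits[k] for k in key]

def unmask_index(masked):
    """Invert the permutation to recover the original bit number."""
    digits = [0] * d
    for pos, k in enumerate(key):
        digits[k] = masked[pos]
    return from_digits(digits, l)

assert unmask_index(mask_index(1234)) == 1234
```

Without the key, an observer of the masked digits must guess the permutation, which is where the estimated guessing probabilities come from.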

59-68
Abstract

The article is devoted to the development of models of destructive impact on the integrity of machine learning models, based on SIR forecasting of the scale of threats and the risks of losses under various scenarios of computer attacks. The article presents an original model of information security threats to the technical components of artificial intelligence in the context of heterogeneous mass computer attacks, reflecting vulnerabilities and possible adversary actions. The authors have developed a methodology for adapting modernized SIR models of natural epidemics to identify similarities and analogies in the way destructive failures spread in AI systems under heterogeneous mass and targeted impacts. The identified patterns made it possible to assess the risks of possible damage to integrity and to develop effective strategies for preventing and correcting distortions of machine learning models.

69-84
Abstract

Recently, the area of adversarial attacks on image quality assessment (IQA) metrics has begun to be explored, whereas the area of defences remains under-researched. In this study, we aim to fill that gap and check the transferability of adversarial purification defences from image classifiers to IQA methods. We apply several widespread attacks to IQA models and examine the success of defences against them. The purification methods cover different preprocessing techniques, including geometrical transformations, compression, denoising, and modern neural-network-based methods. We also address the challenge of assessing the efficacy of a defensive method by proposing ways to estimate output visual quality and the success of neutralizing attacks. We test defences against attacks on three IQA metrics: Linearity, MetaIQA and SPAQ.

85-106
Abstract

We propose a novel neural-network-based method to perform matting of videos depicting people that does not require additional user input such as trimaps. Our architecture achieves temporal stability of the resulting alpha mattes by using motion-estimation-based smoothing of image-segmentation algorithm outputs, combined with convolutional-LSTM modules on U-Net skip connections. We also propose a fake-motion algorithm that generates training clips for the video-matting network given photos with ground-truth alpha mattes and background videos. We apply random motion to photos and their mattes to simulate movement one would find in real videos and composite the result with the background clips. It lets us train a deep neural network operating on videos in an absence of a large annotated video dataset and provides ground-truth training-clip foreground optical flow for use in loss functions.

107-120
Abstract

The article examines modern approaches to enhancing the performance of computing systems based on the residue number system. The objective of the study is to analyze specific sets of residue number system moduli that allow key computational operations, such as addition, reverse conversion, and sign determination, to be performed at minimal cost. Experimental results showed that the  basis was the most efficient among the three moduli sets. This basis is promising for use in high-performance computing systems.
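The specific winning basis is not recoverable from this text. As an illustrative sketch only, assuming the commonly used pairwise-coprime set {2^n-1, 2^n, 2^n+1} with n = 3, the operations the abstract names look like this: forward conversion, carry-free channel-wise addition, and reverse conversion via the Chinese Remainder Theorem:

```python
from math import prod

def to_rns(x, moduli):
    """Forward conversion: represent x by its residues modulo each modulus."""
    return [x % m for m in moduli]

def rns_add(a, b, moduli):
    """Addition runs independently (and in parallel) in each residue channel."""
    return [(x + y) % m for x, y, m in zip(a, b, moduli)]

def from_rns(residues, moduli):
    """Reverse conversion via the Chinese Remainder Theorem."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)   # pow(Mi, -1, m): modular inverse (Python 3.8+)
    return x % M

moduli = [7, 8, 9]                     # assumed set {2^n-1, 2^n, 2^n+1}, n = 3
a, b = to_rns(123, moduli), to_rns(321, moduli)
assert from_rns(rns_add(a, b, moduli), moduli) == 123 + 321
```

The cost asymmetry the abstract mentions is visible here: addition is a few small independent modular sums, while reverse conversion and sign determination require working with the full dynamic range M.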

121-130
Abstract

Semiconductor diodes and grounded-gate MOSFET (GGMOS) devices are commonly used as electrostatic discharge (ESD) protection elements in CMOS IC circuitry. This article presents an implementation of ESD diode and GGMOS macromodels using open-source circuit simulation tools (Qucs-S and Ngspice). The proposed models can serve for circuit simulation of an ESD event. Such simulation makes it possible to estimate the ESD robustness of an IC at the early design stage.

131-146
Abstract

The article describes industrial experience of applying open-source software in the aircraft design process. It focuses on workflow organization, aerodynamic design, and CFD simulation using the OpenProject project management system, the Gitea version control system, the multidisciplinary optimization framework OpenMDAO, the parametric aircraft design tool OpenVSP, and the CFD software OpenFOAM.

147-158
Abstract

This paper presents a study on the selection of the most relevant vector representations for texts in Russian, which are used in the BERTScore metric. This metric is used to assess the quality of generated texts, which can be obtained as a result of solving tasks such as automatic text summarization, machine translation, etc.
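For context, BERTScore matches each token of one text to its most similar token in the other by cosine similarity of their embeddings, then averages. A minimal sketch with toy 2-d vectors (real usage feeds contextual embeddings from a Russian BERT-family encoder, which is exactly the choice the paper studies):

```python
from math import sqrt

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (sqrt(sum(x * x for x in u)) * sqrt(sum(x * x for x in v)))

def bertscore_f1(cand_vecs, ref_vecs):
    """Greedy token matching on cosine similarity, as in BERTScore:
    recall over reference tokens, precision over candidate tokens, then F1."""
    recall = sum(max(cosine(r, c) for c in cand_vecs) for r in ref_vecs) / len(ref_vecs)
    precision = sum(max(cosine(c, r) for r in ref_vecs) for c in cand_vecs) / len(cand_vecs)
    return 2 * precision * recall / (precision + recall)

# Toy per-token "embeddings"; any real vectors plug in the same way.
ref = [[1.0, 0.0], [0.0, 1.0]]
cand = [[1.0, 0.0], [0.6, 0.8]]
score = bertscore_f1(cand, ref)
assert 0.0 < score <= 1.0
```

Because the score is computed entirely from the embeddings, the choice of vector representation directly determines how well the metric correlates with human judgments.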

159-170
Abstract

In this article, we delve into the task of sentiment analysis applied to news articles covering sanctions against Russia, with a specific focus on secondary sanctions. With geopolitical tensions influencing global affairs, understanding the sentiment conveyed in news about sanctions is crucial for policymakers, analysts, and the public alike. We explore the challenges and nuances of sentiment analysis in this context, considering the linguistic complexities, geopolitical dynamics, and data biases inherent in news reporting. Leveraging natural language processing techniques and machine learning models, including Large Language Models (LLM), 1D Convolutional Layer (Conv1D), and Feed-Forward Networks (FFN), we aim to extract sentiment insights from news articles. Our analysis provides valuable perspectives on public opinion, market reactions, and geopolitical trends. Through our work, we seek to illuminate the sentiment landscape surrounding sanctions against Russia and their broader implications.

171-184
Abstract

Large Language Models (LLMs) are being applied across various fields due to their growing capabilities in numerous natural language processing tasks. However, the implementation of LLMs in systems where errors could have negative consequences necessitates a thorough examination of their reliability. Specifically, evaluating the factuality of LLMs helps determine how well the generated text aligns with real-world facts. Despite the existence of numerous factual benchmarks, only a small fraction of them assesses the models' knowledge in the Russian domain. Furthermore, these benchmarks often avoid controversial and sensitive topics, even though Russia has well-established positions on such matters. To overcome the problem of incompleteness of sensitive assessments, we have developed the SLAVA benchmark, comprising approximately 14,000 sensitive questions relevant to the Russian domain across various fields of knowledge. Additionally, for each question, we measured the provocation factor, which determines the respondent's sensitivity to the topic in question. The benchmark results allowed us to rank multilingual LLMs based on their responses to questions on significant topics such as history, political science, sociology and geography. We hope that our research will draw attention to this issue and stimulate the development of new factual benchmarks, which, through the evaluation of LLM quality, will contribute to the harmonization of the information space accessible to a wide range of users and the formation of ideological sovereignty.

185-194
Abstract

The work is devoted to the study of the cognitive function associated with generating elliptical sentences in the Russian language. This function was tested using an open-source system. The material for testing covers only verbal and nominal ellipses that are, in theory, fully recoverable from the context. The texts of planimetric tasks were chosen as testing material. Analysis of the test results revealed the following: the influence of the respondent’s knowledge of the subject area (planimetry) on the understanding of sentences and of the syntactic rules for constructing ellipses; the respondents’ tendency toward self-education; and the respondents’ tendency to remove from sentences any parts that they consider redundant. Thus, the cognitive function of forming ellipses has an integrative character and includes a linguistic component (syntax), knowledge of the subject area, and the mental operations of sentence formation. Due to the revealed complexity of the tested function, evaluating the test results also becomes more complicated. The article considers various models for assessing the respondents’ work, both on an integral basis and with respect to each identified component of the cognitive function.

195-210
Abstract

The classification of the Samoyedic languages has become one of the most popular topics in Uralistics in recent years, with at least six different perspectives expressed by leading experts, often in contradiction with one another. On the LingvoDoc platform, there are 16 dictionaries and concordances of texts in Samoyedic languages. Among these, ten dictionaries – of Nenets, Enets, Nganasan, and Selkup dialects – were compiled from native speakers, while the other six were derived from archival and published sources. They are analyzed using the glottochronology formula developed by S.A. Starostin. The analysis on LingvoDoc results in a 3D graph that depicts the degree of temporal proximity regarding the divergence of the Samoyedic languages and dialects. It was determined that, from a glottochronological perspective, there was a certain proximity between Nenets, Enets, and Nganasan, which are traditionally grouped into the North Samoyedic cluster, while Selkup, Mator, and Kamasin are regarded as South Samoyedic. However, these commonalities existed for a relatively short period. A longer period of unity was observed between the Mator and Kamasin languages and between Nenets and Enets. The highest number of words with no etymology in other lists of basic vocabulary was found in the Selkup dialects and in the Nganasan language, indicating their prolonged isolated existence. The analysis conducted in this study supports the validity of the traditional classification of the Samoyedic languages. Considering the material from early Selkup texts provides more reliable evidence for postulating the South Samoyedic group.

211-224
Abstract

This paper aims to describe certain phonetic, morphological and lexical features of Australian Aboriginal English that have been detected throughout the analysis of Australian Aboriginal English texts in LingvoDoc and Praat. The study outlines the methods, goals, and benefits of using the linguistic platform LingvoDoc to identify and systematize the grammatical and lexical features of Australian Aboriginal English. Numerous researchers note that Australian Aboriginal English is a distinct ethnolect, differing from the English spoken by Australians of British descent. By using LingvoDoc to create a collection of Australian Aboriginal English dictionaries that describe features specific to particular localities in Australia, it is likely we can draw conclusions about correlations between the lexical and grammatical features of this ethnolect and various extralinguistic factors. The texts under scrutiny include transcripts of interviews with Aboriginal Elders, musicians, teachers and artists, song lyrics, and personal stories. Informants originate from various places across Australia and belong to various age cohorts, from adolescence to late adulthood. Texts were grouped based on informants’ places of origin, and a separate dictionary for each of those places was created in LingvoDoc. Each dictionary was attached to a human settlement on the world map, which helped us track the correlation between the speakers’ origin and the grammatical and lexical characteristics of their speech. This method reveals which linguistic patterns may be characteristic of speakers from certain geographical areas, thus unveiling potential correlations. The phonetic part of our study aims to discover differences between vowel formants in Standard Australian English and Australian Aboriginal English.

225-236
Abstract

The paper is devoted to identifying the distribution of dual number markers of nouns, expressed by analytical and synthetic means, in the southern, transitional, central and northern dialects and subdialects of Selkup. The material of the study is corpus data of more than 85,000 wordforms located on the digital platform LingvoDoc and in personal archives (Fieldworks Language Explorer files), as well as general grammatical and lexical papers on the language. It was revealed that the use of the basic Selkup dual number suffix -q(V) with nouns is distributed heterogeneously across Selkup dialects: in a number of materials the marker is not found when attached directly to the stem (transitional subdialects, the southern part of the Narym dialect); however, it is used everywhere with animate nouns denoting a complex of two homogeneous objects via the suffix of mutual connection -sa- and the collective set suffix mɨ-. In the southern part of the Narym dialect and in transitional subdialects, new dual formations are recorded in the form of the markers ‑štja or -štjaq(V). In the Vasyugan and northern materials, cases of the doubled dual with the suffixes of mutual connection or the collective set are attested: -sa- / -mɨ- + -qV-q(V). It is necessary to distinguish the more northern and the southern parts of the Narym dialect: in the first, the dual suffix -q(V) appears, while in the second, innovations in the dual forms such as -štja and -štjaq(V) are noted. In all Selkup materials, the main strategy (or one of the main strategies) for marking the duality of nouns is analytical, consisting in the use of the numeral sitte ‘two’ with the noun in the singular (in some cases in the plural).

237-250
Abstract

Extractive summarization is the task of highlighting the most important parts of a text. We introduce a new approach to the extractive summarization task that uses the hidden clustering structure of the text. Experimental results on CNN/DailyMail demonstrate that our approach generates more accurate summaries than both extractive and abstractive methods, achieving state-of-the-art results in terms of the ROUGE-2 metric and exceeding previous approaches by 10%. Additionally, we show that the hidden structure of the text can be interpreted as aspects.

251-276
Abstract

Congestion control is a key aspect of modern networks. The first congestion control algorithms, such as TCP Tahoe and TCP Reno, were developed in the late 20th century, and their core concepts remain relevant to this day. With the development of high-speed networks, specialized algorithms such as TCP BIC and TCP CUBIC were created, which are adapted to these conditions. However, classical algorithms with predefined rules are not always effective in all network environments, and with the rise of 4G, 5G, and satellite communications, the congestion control issue has become increasingly relevant. This has led to the emergence of numerous works on machine learning-based congestion control algorithms, particularly reinforcement learning, which can adapt to dynamically changing network conditions. This paper presents and reviews both classical congestion control algorithms and the most popular and recent machine learning-based algorithms, along with some implementations using multipath. Additionally, it highlights the most significant challenges of machine learning-based algorithms and discusses potential directions for future research in this field.
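The "predefined rules" of the classical algorithms the survey opens with can be made concrete with a toy Reno-style sketch: slow start below the threshold, additive increase above it, multiplicative decrease on loss. This is an illustration of the general scheme, not any of the surveyed implementations:

```python
def aimd_step(cwnd, ssthresh, loss):
    """One RTT of a simplified TCP-Reno-style congestion window update."""
    if loss:
        ssthresh = max(cwnd / 2, 1.0)   # multiplicative decrease
        return ssthresh, ssthresh
    if cwnd < ssthresh:
        return cwnd * 2, ssthresh       # slow start: exponential growth per RTT
    return cwnd + 1, ssthresh           # congestion avoidance: +1 segment per RTT

cwnd, ssthresh = 1.0, 64.0
trace = []
for rtt in range(10):
    cwnd, ssthresh = aimd_step(cwnd, ssthresh, loss=(rtt == 7))
    trace.append(cwnd)

# Window grows 2, 4, ..., 64, then +1, halves on the loss at RTT 7, then +1 again.
assert trace[5] == 64.0 and trace[7] == 32.5
```

Learning-based algorithms replace exactly these fixed update rules with a policy (often trained by reinforcement learning) that maps observed network state to rate decisions.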

277-290
Abstract

The paper discusses methods of runtime verification of software systems that are security protection mechanisms (PMs) or include such mechanisms in their design. To ensure a high level of trust and security of software systems, it is necessary to use different verification methods and technologies. Not only the potential power of a method is important, but also the possibility of using it in the real conditions of industrial development of large and complex software systems. The rigor and accuracy of verification are ensured by formal methods; however, the use of classical formal methods imposes special, extremely high requirements on personnel and entails additional labor costs. The article proposes a technology for runtime verification of PMs which, on the one hand, is close to testing techniques and is therefore easier for test engineers to master, and, on the other hand, is based on formal access control models and specifications of external PM interfaces, which are already appearing among OS and DBMS developers whose products must meet the requirements of the new national standard GOST R 59453.4-2025 "Information Security. Formal access control model. Part 4. Recommendations for verification of information security tools implementing access control policies based on formalized descriptions of the access control model". This standard is also presented in the article.

291-302
Abstract

The paper presents an approach to detecting memory and other resource leaks in the Svace static analyzer. We lay down a set of static analysis requirements that show the philosophy behind Svace, briefly describe the main analysis infrastructure based on an interprocedural symbolic execution with state merging, and show how this infrastructure can be applied to leak detection. We list attributes that are computed during analysis and present how this computation is performed, how allocation and release functions are modeled via specifications, how escaped memory is accounted for. We propose a way to provide external information not present in the source code to the analyzer via creating artificial functions. Experimental results on the Juliet test suite and the Binutils package show the viability of our ideas. 
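The attribute bookkeeping described, tracking allocations, releases, and escaped memory, can be illustrated with a toy dynamic analogue. This sketch is not Svace's interprocedural symbolic execution; it merely shows the leak criterion (allocated, never released, never escaped) over an event trace, with hypothetical event names:

```python
def find_leaks(events):
    """events: list of ('alloc', id), ('free', id), or ('escape', id) pairs.
    Returns ids that were allocated but neither freed nor escaped."""
    live, escaped = set(), set()
    for op, obj in events:
        if op == 'alloc':
            live.add(obj)        # resource enters the tracked set
        elif op == 'free':
            live.discard(obj)    # modeled release function clears the attribute
        elif op == 'escape':
            escaped.add(obj)     # e.g. stored globally or returned to the caller

    return sorted(live - escaped)

assert find_leaks([('alloc', 'a'), ('alloc', 'b'), ('free', 'a'),
                   ('alloc', 'c'), ('escape', 'c')]) == ['b']
```

In the static setting the same question is asked symbolically along merged program paths, and the specifications mentioned in the abstract tell the analyzer which real functions play the 'alloc' and 'free' roles.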

303-310
Abstract

Software testing of automated systems at different stages of their life cycle (LC) differs in terms of goals, tasks, objects, methods, and test results. At the same time, the same terms are used in the scientific and technical literature to describe different types of objects, despite the difference in their properties and in the methods of working with them during testing. The purpose of this work is to consider the set of concepts used in the field of testing throughout the software life cycle of automated systems, and to trace how the semantic content of these concepts changes depending on what properties the automated system software should have at the current stage of the LC. Accordingly, the characteristics of the software under test, the volume of tests, the degree of conformity of the test object, and the resources necessary for a specific type of testing change as well. Understanding these differences significantly affects the types and methods of testing used, as well as the requirements for test automation tools.

311-324
Abstract

Alternative levels of detail (LOD) are one of the most promising approaches to effective rendering of complex spatial 3D scenes. The approach has been realized in the hierarchical levels of detail (HLOD) and hierarchical dynamic levels of detail (HDLOD) methods, which are currently well studied and successfully applied for conservative and interactive rendering of large dynamic scenes. At the same time, the issues of efficient generation of levels of detail, which are critical in a number of applications related to visual modeling of complex industrial projects and large-scale infrastructure programs, have not received due attention. The paper considers emerging techniques for accelerating HLOD and HDLOD generation. It also discusses the possibility of quickly updating hierarchical levels of detail, taking into account permanent local changes in the three-dimensional model, typical for collaborative applications.

325-354
Abstract

This paper presents a systematic review of hardening mechanisms for operating systems and user applications. Various types of protection mechanisms are discussed, including memory protection mechanisms, hardware stack protection, dynamic memory protection, address space randomization, control flow protection, and system integrity protection. The principles of these mechanisms, their effectiveness, and their impact on system performance are analyzed in detail. Special attention is given to the implementation of protective mechanisms in modern operating systems, particularly in the Linux kernel. This work is intended for information security specialists, operating system developers, and researchers working on information security issues.



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)