Security-by-Design is an important approach to ensuring software security and reliability. Although it has been developing for more than 50 years, its principles and techniques are still not widely known among software developers. To make the approach more familiar and popular, we need to restate its goals and problems, to classify and explain its techniques, and to formulate trends in its future development. This paper reformulates the main principles of Security-by-Design, provides examples of security design patterns and anti-patterns, and explores the relations between the approach and software architecture analysis methods, hardening techniques, and safe programming languages.
In this work, we discuss methods for identifying build prerequisites and describe their strengths and weaknesses. We present the buildography tool, which logs the build process by tracking system calls. An estimate of the time spent on the build process when using the buildography tool is also given.
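A minimal sketch of the system-call-tracing idea, assuming a Linux host with strace installed; this illustrates the general technique only and is not the buildography implementation itself.

```python
# Hedged illustration (not the buildography tool): log a build by tracing
# file-related system calls with strace and collecting the opened paths.
import re
import subprocess
import tempfile

def trace_build(build_cmd):
    """Run a build command under strace and return the set of files it opened."""
    with tempfile.NamedTemporaryFile(suffix=".log") as log:
        subprocess.run(
            ["strace", "-f", "-e", "trace=openat,execve", "-o", log.name] + build_cmd,
            check=True,
        )
        opened = set()
        pattern = re.compile(r'openat\([^,]+, "([^"]+)"')
        for line in open(log.name, encoding="utf-8", errors="replace"):
            m = pattern.search(line)
            if m:
                opened.add(m.group(1))
        return opened

# Example: trace_build(["make", "-j4"]) yields the file-level prerequisites touched by the build.
```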
Recently, heterogeneous computer systems have been widely used to solve computational tasks with strict constraints on performance (throughput) and power consumption. Typically, such systems consist of general-purpose microprocessors and FPGA-based hardware accelerators implementing the most expensive operations (which are usually application-specific). This article is devoted to the design automation of hardware accelerators for streaming data computing. The features of this type of accelerator (and of the problems it solves) are as follows: (1) continuous (cycle-by-cycle) reception and production of data; (2) bounded (in time and memory) output-input dependence. Streaming data computing covers a wide range of applications, including digital signal processing, traffic encryption, numerical modeling, bioinformatics, etc. The paper introduces the concept of DFCIR (DataFlow Computer Intermediate Representation), a language for the intermediate representation of streaming data computing designs. The DFCIR language is based on the open compiler infrastructure MLIR (Multi-Level Intermediate Representation). RTL models of accelerators are built from DFCIR descriptions with the use of CIRCT (Circuit IR Compilers and Tools), a subproject of MLIR that combines tools for working with hardware designs.
The authors describe a system that, given a set of designer-specified layout constraints (guidelines) and a description of the logical structure of a graphical user interface (GUI), generates a set of particular layouts. Each of these layouts complies with the given guidelines by construction. The authors also give a formal treatment of the task as a constraint satisfiability problem and describe the construction of a sound and complete solver based on the relational verifier-to-solver approach. They also describe a number of refinements that make the solver more efficient and applicable.
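For illustration, a minimal sketch of the constraint-satisfaction view of the task using the off-the-shelf z3 solver; the widget names and guideline values are hypothetical, and the paper's own solver is constructed differently, via the relational verifier-to-solver approach.

```python
# Hedged sketch: encode simple layout guidelines as integer constraints and ask an
# off-the-shelf SMT solver (z3) for a satisfying layout. This only illustrates the
# constraint-satisfaction view of the task, not the paper's solver construction.
from z3 import Ints, Solver, sat

def place_two_widgets(container_width=320, min_gap=8):
    # x1/w1: position and width of a label; x2/w2: of a button (hypothetical widgets)
    x1, w1, x2, w2 = Ints("x1 w1 x2 w2")
    s = Solver()
    s.add(x1 >= 0, w1 >= 40, x2 >= 0, w2 >= 60)   # minimum sizes
    s.add(x1 + w1 + min_gap <= x2)                # label left of button, with a gap
    s.add(x2 + w2 <= container_width)             # both widgets fit inside the container
    if s.check() == sat:
        m = s.model()
        return {str(v): m[v].as_long() for v in (x1, w1, x2, w2)}
    return None  # no layout satisfies the guidelines

print(place_two_widgets())
```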
In the article, taking into account the limited number of copies of a structured software resource, a comparative analysis of the mathematical relationships for calculating the total execution time of a set of identically distributed competing processes in the asynchronous mode and in two synchronous modes is carried out. For the cases of unlimited parallelism and of parallelism limited by the number of processors of a multiprocessor system, a sufficient condition for the efficiency of an identically distributed system is obtained, and a necessary and sufficient condition for the existence of an efficient system of identically distributed competing processes is proven, depending on the amount of additional system overhead.
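As an illustration of the kind of relationship involved (not necessarily the exact expressions derived in the article), classic pipeline reasoning for n identically distributed processes, each passing through s blocks of equal duration t, gives:

```latex
% Illustrative pipeline-type relation (an assumption, not the article's derivation):
% asynchronous execution with unlimited processors finishes in
\[
  T_{\text{async}}(n, s, t) = (n + s - 1)\, t ,
\]
% while a synchronous mode that adds an overhead \varepsilon per block step is bounded by
\[
  T_{\text{sync}}(n, s, t, \varepsilon) \le (n + s - 1)\,(t + \varepsilon) .
\]
```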
An integral part of creating high-performance computing systems intended for the numerical modeling of various physical processes is checking their compliance with the characteristics stated during their design. The article discusses a script-based runtime environment developed by the authors for applying methodical applied tests to the numerical investigation of the parameters of high-performance computing systems. This environment enables efficient analysis of the test results and provides an assessment of the performance and reliability of high-performance computing systems.
Variational inequalities, as an effective tool for solving applied problems, including machine learning tasks, have been attracting more and more attention from researchers in recent years. The use of variational inequalities covers a wide range of areas – from reinforcement learning and generative models to traditional applications in economics and game theory. At the same time, it is impossible to imagine the modern world of machine learning without distributed optimization approaches that can significantly speed up training on large amounts of data. However, faced with the high costs of communication between devices in a computing network, the scientific community is striving to develop approaches that make computations cheap and stable. In this paper, we investigate the technique of compressing transmitted information and its application to distributed variational inequality problems. In particular, we present a method based on advanced techniques originally developed for minimization problems. For the new method, we provide an exhaustive theoretical convergence analysis for cocoercive strongly monotone variational inequalities. We conduct experiments that emphasize the high performance of the presented technique and confirm its practical applicability.
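A minimal sketch of the compression idea, assuming an unbiased Rand-k sparsifier and simple linear operators on a toy problem; the method presented in the paper builds on more advanced techniques and is not reproduced here.

```python
# Hedged sketch (not the paper's algorithm): workers send Rand-k-compressed operator
# values to a server, which averages them and makes a simple step on the VI variable.
import numpy as np

def rand_k(v, k, rng):
    """Unbiased random sparsification: keep k coordinates, rescale by d/k."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

def distributed_vi_step(x, operators, k, step, rng):
    """One communication round: each worker compresses F_i(x), the server averages."""
    msgs = [rand_k(F(x), k, rng) for F in operators]
    g = np.mean(msgs, axis=0)
    return x - step * g

# Toy example: F_i(x) = A_i x + b_i, strongly monotone when A_i + A_i^T is positive definite.
rng = np.random.default_rng(0)
d, n_workers = 10, 4
ops = []
for _ in range(n_workers):
    A = np.eye(d) + 0.1 * rng.standard_normal((d, d))
    b = rng.standard_normal(d)
    ops.append(lambda x, A=A, b=b: A @ x + b)
x = np.zeros(d)
for _ in range(200):
    x = distributed_vi_step(x, ops, k=3, step=0.1, rng=rng)
```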
With the increasing use of artificial intelligence (AI) models, more attention is being paid to issues of trust and security of AI systems against various types of threats (evasion attacks, poisoning, membership inference, etc.). In this work, we focus on the task of graph node classification, highlighting it as one of the most complex. To the best of our knowledge, this is the first study to explore how defense methods against different types of threats interact for AI models on graph data. Our experiments are conducted on citation and purchase graph datasets. We demonstrate that, in general, it is not advisable to simply combine defense methods for different types of threats, as this can lead to severe negative consequences, including a complete loss of model effectiveness. Furthermore, we provide a theoretical proof of the contradiction between defense methods against poisoning attacks on graphs and adversarial training.
With the growing application of interpretable artificial intelligence (AI) models, increasing attention is being paid to issues of trust and security across all types of data. In this work, we focus on the task of graph node classification, highlighting it as one of the most challenging. To the best of our knowledge, this is the first study to comprehensively explore the relationship between interpretability and robustness. Our experiments are conducted on datasets of citation and purchase graphs. We propose methodologies for constructing black-box attacks on graph models based on interpretation results and demonstrate how adding protection impacts the interpretability of AI models.
Modern large language models are huge systems with complex internal mechanisms that implement black-box response generation. Although aligned large language models have built-in defense mechanisms, recent studies demonstrate that they remain vulnerable to attacks. In this study, we aim to expand the existing malicious datasets obtained from attacks so that similar vulnerabilities in large language models can be addressed in the future through the alignment procedure. In addition, we conduct experiments with modern large language models on our malicious dataset, which demonstrate the existing weaknesses in the models.
This paper presents a method for the automatic generation of information extraction rules (sitemaps) for news websites. The proposed approach generates a sitemap based on a set of news pages from a single site, enabling attribute extraction from arbitrary news pages on that site. The method is based on applying a fine-tuned neural network model, MarkupLM, to extract information from web pages. This approach generalizes the model’s predictions at the site level, creating universal rules for attribute extraction. Experimental results show that using sitemaps generated with the fine-tuned model surpasses both existing open-source tools and the fine-tuned MarkupLM applied at the individual page level. The developed method can be extended to other domains if relevant data for model fine-tuning is available.
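A minimal sketch of the per-page step, assuming a hypothetical fine-tuned MarkupLM checkpoint and the Hugging Face transformers API; the site-level sitemap generation described above aggregates such per-page predictions and is not shown here.

```python
# Hedged sketch: run a MarkupLM token-classification model on one news page and read
# out per-token label predictions. The checkpoint path and label set are assumptions.
import torch
from transformers import MarkupLMProcessor, MarkupLMForTokenClassification

MODEL = "path/to/fine-tuned-markuplm"   # hypothetical fine-tuned checkpoint
processor = MarkupLMProcessor.from_pretrained(MODEL)
model = MarkupLMForTokenClassification.from_pretrained(MODEL)

html = "<html><body><h1>Some headline</h1><p>Article text...</p></body></html>"
encoding = processor(html, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**encoding).logits            # shape: (1, seq_len, num_labels)
pred_ids = logits.argmax(-1).squeeze(0).tolist()
labels = [model.config.id2label[i] for i in pred_ids]
# Generalizing such per-page predictions across many pages of one site is what yields
# a site-level extraction rule (sitemap) in the spirit of the described approach.
```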
The paper shows that a pressing problem in the development of nanosatellites is the lack of open software for on-board computing devices and “smart” payloads. The development of an open software package for the centralized management of the target terminal devices of nanosatellites, based on a microservice architecture, is considered, and the advantages of this approach to building the software package are shown. It is proposed to use a nanosatellite simulation model for operational debugging and testing of the software package. The authors present the structure of the software package and show the place of the simulation model within it. The work is a detailed review of the UEMKA software package developed by the authors.
The density properties of subsiding loess soils were studied within the framework of mathematical modeling of their compaction by the method of deep explosions. Soil compaction is carried out to eliminate subsidence properties; loess soils are characterized by low density and high porosity. The density properties of loess depend on the parameters of the diffusion interaction between the gas atoms formed as a result of the explosion and the soil being compacted. Solving the inverse applied problems that arise when studying mathematical models of geological systems allows us to systematize knowledge about them. The work considers the inverse problem of estimating the diffusion coefficient. Mathematical modeling of the vertical diffusion coefficient in anisotropic and isotropic geological systems was carried out, and the case of complete absorption of gas atoms by the surrounding soil was studied. A numerical estimation of the vertical diffusion coefficient in soil before and after compaction was performed over time, with an accuracy sufficient for engineering calculations. Gas diffusion coefficients in soils of various densities were obtained. The constructed mathematical relationships for estimating the vertical diffusion coefficient make it possible to predict the density properties of soils at the stage of designing the foundations of construction projects.
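For illustration only, one common way to pose such a problem (the article's exact model may differ) is a vertical diffusion equation with an absorption term and a least-squares estimate of the diffusion coefficient:

```latex
% Illustrative formulation (an assumption, not the article's exact model): vertical
% diffusion of the gas concentration u(z,t) with absorption by the surrounding soil,
\[
  \frac{\partial u}{\partial t}
    = \frac{\partial}{\partial z}\!\left( D \, \frac{\partial u}{\partial z} \right) - \lambda u,
  \qquad u(z, 0) = u_0(z), \qquad u\big|_{\Gamma} = 0,
\]
% and the inverse problem of estimating the vertical diffusion coefficient D from
% observed concentrations u^{obs} as a least-squares fit:
\[
  D^{\ast} = \arg\min_{D} \int_0^{T}\!\!\int_{\Omega}
             \bigl( u(z, t; D) - u^{\mathrm{obs}}(z, t) \bigr)^{2} \, dz \, dt .
\]
```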
The article explores methods of designing security systems to ensure data confidentiality in telemedicine. The approaches to patient authentication used in telehealth care are considered, with an emphasis on their effectiveness and security. An analysis of authentication methods, including biometric identification, two-factor authentication, and the use of unique identification codes, was carried out. The advantages and disadvantages of each of these methods have been identified, which gives medical organizations the opportunity to make management decisions that best match the structure of their information systems and acceptable levels of risk.
The aim was to investigate how city configuration influences the spread of a lethal disease. For this purpose, several configurations of subareas with high and low population density are considered. The spread was simulated with a stochastic cellular automata approach, with the main indicators determined as averages over a set of runs. Since the automaton was based on a SIRS model, we used the number of simultaneously sick individuals as an economic indicator and the cumulative number of deaths as a social one. In addition, we considered man-shift losses as a cost-loss parameter. The simulation results show that, to minimize the number of deaths and the economic losses, it is preferable to use a regular grid of square-shaped low-density subareas. The suggested model thus indicates that city planning is important for minimizing pandemic damage, which can be reduced through a smart urban development policy.
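A minimal sketch of a stochastic SIRS cellular automaton with high- and low-density subareas; all rates and the density layout below are illustrative assumptions, not the calibrated model used in the study.

```python
# Hedged sketch: each cell is Susceptible, Infected, Recovered, or Dead; the local
# infection probability grows with the number of infected neighbors and the cell's
# density class. Rates and layout are illustrative assumptions.
import numpy as np

S, I, R, D = 0, 1, 2, 3

def step(state, density, rng, p_inf=0.08, p_rec=0.1, p_die=0.01, p_wane=0.02):
    new = state.copy()
    inf = (state == I).astype(float)
    # number of infected 4-neighbors per cell (toroidal grid for simplicity)
    nbrs = (np.roll(inf, 1, 0) + np.roll(inf, -1, 0) +
            np.roll(inf, 1, 1) + np.roll(inf, -1, 1))
    u = rng.random(state.shape)
    p_local = 1.0 - (1.0 - p_inf * density) ** nbrs        # density scales contact intensity
    new[(state == S) & (u < p_local)] = I
    new[(state == I) & (u < p_die)] = D
    new[(state == I) & (u >= p_die) & (u < p_die + p_rec)] = R
    new[(state == R) & (u < p_wane)] = S                    # SIRS: immunity wanes
    return new

rng = np.random.default_rng(1)
n = 100
density = np.where(rng.random((n, n)) < 0.25, 1.0, 0.3)     # high/low-density subareas
state = np.full((n, n), S)
state[n // 2, n // 2] = I
peak_sick = 0
for _ in range(300):
    state = step(state, density, rng)
    peak_sick = max(peak_sick, int((state == I).sum()))
cum_dead = int((state == D).sum())
print(peak_sick, cum_dead)   # simultaneous-sick peak and cumulative deaths for one run
```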
This paper presents the effectiveness of using the H1 system for retrieving products from various vendors in the marketplace. The H1 system is a hybrid model that combines the benefits of lexical-based and semantic-based retrieval techniques, similar to other state-of-the-art product retrieval systems. The novelty of this approach lies in its combination of token-level retrievals. The advantage of the H1 system over other existing solutions is its ability to handle complex search queries containing multi-word brand names. For example, search queries like "new balance sneakers" and "Gloria Jeans children's clothing" will be segmented so that "new balance" and "Gloria Jeans" are treated as single tokens, respectively, which helps reduce the retrieval model's size and improves its standalone performance. The H1 system achieved an mAP@12 score of 56.1% and an R@1K score of 86.6% on the public WANDS dataset, outperforming other state-of-the-art models. These results demonstrate the effectiveness of the approach and its potential for improving product search experiences for online shoppers.
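A minimal sketch of brand-aware query segmentation, assuming a hypothetical multi-word brand lexicon; the H1 system's actual token-level retrieval is more involved.

```python
# Hedged sketch: segment a query so that known multi-word brands stay single retrieval
# tokens. The lexicon and longest-match strategy are assumptions, not the H1 internals.
MULTIWORD_BRANDS = {"new balance", "gloria jeans"}   # hypothetical lexicon

def segment_query(query, brands=MULTIWORD_BRANDS, max_len=3):
    words = query.lower().split()
    tokens, i = [], 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 1, -1):   # longest match first
            cand = " ".join(words[i:i + n])
            if cand in brands:
                tokens.append(cand)
                i += n
                break
        else:
            tokens.append(words[i])
            i += 1
    return tokens

print(segment_query("new balance sneakers"))              # ['new balance', 'sneakers']
print(segment_query("Gloria Jeans children's clothing"))  # ['gloria jeans', "children's", 'clothing']
```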
The article considers the practical quality assessment of modern machine learning models based on deep neural networks and vision transformers. The parameters of the experiment conducted on the ISIC 2018 dataset are described, and statistics on the categories of the considered skin lesions are given. The statistical analysis of the obtained results allowed the authors to form a new binary categorization: melanocytic and non-melanocytic skin lesions. Experiments on training the neural network models were performed on the facilities of the NCMU Digital Ecosystem.