
Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Vol 32, No 1 (2020)
7-26
Abstract

Designing a trusted access control mechanism of an operating system (OS) is a complex task when the goal is to achieve a high level of security assurance and guarantees of the absence of unwanted information flows. It becomes even more complex when the integration of several heterogeneous mechanisms, such as role-based access control (RBAC), mandatory integrity control (MIC), and multi-level security (MLS), is considered. This paper presents the results of developing a hierarchical integrated model of access control and information flows (HIMACF), which provides a holistic integration of RBAC, MIC, and MLS while preserving the key security properties of all these mechanisms. The previous version of this model is called the MROSL DP-model. The model is now formalized using the Event-B formal method, and its correctness is formally verified. In the hierarchical representation of the model, each hierarchy level (module) corresponds to a separate security control mechanism, so the model can be verified with less effort by reusing the verification results of lower-level modules. The model is implemented in a Linux-based operating system using the Linux Security Modules infrastructure.
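
The abstract's key idea is that each level of the hierarchy corresponds to one mechanism and access is granted only when every level agrees. The Python sketch below illustrates that layered style of decision making under simplified, assumed rules (the role names, level encodings, and read/write conditions are illustrative); it is not the HIMACF formalization, the MROSL DP-model, or the Event-B specification.

```python
# Illustrative sketch only: a layered access check in the spirit of the model,
# where each mechanism (RBAC, MIC, MLS) is a separate level of the hierarchy.
# Role names, level encodings, and rules are simplified assumptions.
from dataclasses import dataclass, field

@dataclass
class Subject:
    roles: set = field(default_factory=set)  # RBAC roles
    integrity: int = 0                        # MIC integrity level
    clearance: int = 0                        # MLS security level

@dataclass
class Entity:
    required_role: str = "user"
    integrity: int = 0
    classification: int = 0

def rbac_allows(s, o, op):
    # RBAC level: the subject must hold a role that permits the operation.
    return o.required_role in s.roles

def mic_allows(s, o, op):
    # MIC level: no writing to objects of higher integrity ("no write up").
    return op != "write" or s.integrity >= o.integrity

def mls_allows(s, o, op):
    # MLS level: no reading above clearance, no writing below it.
    return s.clearance >= o.classification if op == "read" else s.clearance <= o.classification

def access_allowed(s, o, op):
    # Access is granted only if every level of the hierarchy permits it.
    return all(check(s, o, op) for check in (rbac_allows, mic_allows, mls_allows))

if __name__ == "__main__":
    admin = Subject(roles={"admin", "user"}, integrity=2, clearance=1)
    log = Entity(required_role="user", integrity=1, classification=1)
    print(access_allowed(admin, log, "write"))  # True under these toy rules
```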

27-56
Abstract

Existing models of mandatory integrity control in operating systems restrict accesses of active components of a system to passive ones and represent the accesses directly: subjects get read or write access to objects. Such a representation can be used for modeling monolithic operating systems, whose components that provide access to resources are part of the trusted computing base. However, the implementation of these components is extremely complex, so it is arduous to prove the absence of bugs (vulnerabilities) in them. In other words, proving such a model to be adequate to the real system is nontrivial and often left unsolved. This article presents a mandatory integrity control model for a microkernel operating system called KasperskyOS. The microkernel organization of the system allows us to minimize the trusted computing base to include only the microkernel and a limited number of other components. Parts of the system that provide resource access are generally considered untrusted. Even if some of them are erroneous, the operating system can still provide particular security guarantees. To prove that by means of a model, we introduce the notion of object drivers as intermediaries in operations on objects. We define the requirements that object drivers must satisfy and add the means for analyzing the consequences of violations of these requirements. We state and prove that the model either preserves integrity if all active components satisfy the requirements, or restricts the negative impact if some of the components are compromised. A correct implementation of the model guarantees that compromised components will not affect components with higher or incomparable integrity levels. We describe a policy specification language developed in accordance with the model and provide an example of using it to describe a security policy that ensures a correct update of a system running KasperskyOS.
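
The central guarantee quoted above is that a compromised component cannot affect components with a higher or incomparable integrity level. A minimal Python sketch of that rule, with integrity levels modeled as an assumed partial order (the level names and the relation are invented for illustration and are unrelated to the actual KasperskyOS policy language), might look as follows.

```python
# Illustrative sketch only: integrity levels form a partial order, and a
# (possibly compromised) component may influence another only if the target's
# level is comparable to and not higher than its own. Level names are invented.

# Assumed partial order, listed as "level -> set of strictly lower levels".
LOWER = {
    "low": set(),
    "updater": {"low"},
    "storage": {"low"},
    "kernel": {"updater", "storage", "low"},
}

def leq(a, b):
    # True if integrity level a is less than or equal to level b.
    return a == b or a in LOWER.get(b, set())

def may_influence(source_level, target_level):
    # Influence is allowed only "downwards" in the partial order, so a
    # compromised source never reaches higher or incomparable levels.
    return leq(target_level, source_level)

if __name__ == "__main__":
    print(may_influence("updater", "low"))      # True: strictly lower target
    print(may_influence("updater", "kernel"))   # False: higher target
    print(may_influence("updater", "storage"))  # False: incomparable levels
```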

57-70
Abstract

The paper discusses the creation of rendering systems for airborne civil aviation systems. All software used on board must comply with internationally accepted safety standards, which imposes additional requirements on both the hardware used and the system development process. This work is devoted to the specifics of using multi-core processors in aviation embedded systems to improve the performance of a software implementation of the OpenGL SC library. The possibility of using multi-core processors in safety-critical systems is provided by the Russian real-time operating system JetOS. The implementation of multi-window rendering with the software OpenGL SC library is also considered.

71-88
Abstract

The article describes the technology of automatic software testing as applied to industrial systems for computer graphics and optical simulation. Test automation becomes vital under limited resources and the frequent product releases that are common among software vendors. Both methods for regression testing of the computational kernel of such systems and methods for testing the user interface are presented. A Python-based scripting mechanism is used for regression testing; its parallelization capabilities, which significantly decrease testing time, are also described. Python offers two ways of parallelization, multithreading and multiprocessing, and both are considered. Because of the stochastic methods used in optical simulation, calculation results may differ from run to run, which complicates regression testing. In this case, it is proposed to apply a threshold, chosen individually for each case, when comparing simulation results. Automated testing of the user interface, built on the AutoIt tool, is described separately. An approach to testing the user interface of systems implemented as plugins to existing CAD/PDM complexes, whose source code is closed and unavailable to the authors of the automated tests, is described as well.
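
As a rough illustration of the approach described (not the authors' actual test framework), the following Python sketch compares simulation outputs against references in parallel with multiprocessing and applies a per-test tolerance to absorb the run-to-run noise of stochastic optical simulation; the test names, values, and thresholds are invented.

```python
# Illustrative sketch, not the authors' framework: compare simulation outputs
# against references in parallel, using a per-test tolerance for stochastic noise.
import multiprocessing as mp

# Hypothetical test set: (test name, new result, reference result, threshold).
TESTS = [
    ("scene_diffuse", 0.8713, 0.8705, 1e-2),
    ("scene_caustics", 12.402, 12.561, 5e-1),
    ("scene_spectral", 3.1401, 3.1399, 1e-3),
]

def run_one(case):
    name, value, reference, threshold = case
    # A test passes if the deviation from the reference stays within its own threshold.
    return name, abs(value - reference) <= threshold

if __name__ == "__main__":
    # multiprocessing sidesteps the GIL, so CPU-bound comparisons scale with cores;
    # threading would be preferable only for I/O-bound steps.
    with mp.Pool() as pool:
        for name, passed in pool.map(run_one, TESTS):
            print(f"{name}: {'PASS' if passed else 'FAIL'}")
```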

89-108
Abstract

This paper is dedicated to the analysis of existing approaches to video codec comparisons. It reveals drawbacks of popular comparison methods and proposes new techniques. An analysis of a collection of user-generated videos showed that two of the most popular open video collections from media.xiph.org, widely used for video codec analysis and development, do not cover the complexity distribution of real-life videos. A method for creating representative video sets covering all segments of user videos by spatial and temporal complexity is also proposed. One of the sections discusses video quality estimation algorithms used for codec comparisons and shows the disadvantages of the popular VMAF and NIQE methods. The paper also describes drawbacks of BD-rate, the method generally used for final codec ranking in comparisons, and proposes a new ranking method called BSQ-rate that addresses the identified issues. The results of this investigation were obtained during a series of studies conducted as part of the annual video codec comparisons organized by the video group of the computer graphics and multimedia laboratory at Moscow State University.
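
For context on the ranking discussion, the sketch below shows how BD-rate is conventionally computed: log-bitrate is fitted as a cubic polynomial of quality for each codec, and the curves are integrated over the overlapping quality range. The rate-distortion points are invented, and the BSQ-rate method proposed in the paper is not reproduced here.

```python
# Illustrative sketch of the conventional BD-rate (Bjontegaard delta rate) calculation.
import numpy as np

def bd_rate(rd_anchor, rd_test):
    """Average bitrate difference (%) of the test codec vs. the anchor at equal quality.

    Each argument is a list of (bitrate_kbps, quality) rate-distortion points.
    """
    r1, q1 = np.log10([p[0] for p in rd_anchor]), [p[1] for p in rd_anchor]
    r2, q2 = np.log10([p[0] for p in rd_test]), [p[1] for p in rd_test]

    # Fit log-bitrate as a cubic polynomial of quality for each codec.
    p1 = np.polyfit(q1, r1, 3)
    p2 = np.polyfit(q2, r2, 3)

    # Integrate both curves over the overlapping quality range.
    lo, hi = max(min(q1), min(q2)), min(max(q1), max(q2))
    int1 = np.polyval(np.polyint(p1), hi) - np.polyval(np.polyint(p1), lo)
    int2 = np.polyval(np.polyint(p2), hi) - np.polyval(np.polyint(p2), lo)

    # Average log-rate difference converted to a percentage.
    avg_diff = (int2 - int1) / (hi - lo)
    return (10 ** avg_diff - 1) * 100

if __name__ == "__main__":
    anchor = [(1000, 34.0), (2000, 37.0), (4000, 40.0), (8000, 43.0)]
    test = [(900, 34.2), (1800, 37.1), (3600, 40.2), (7200, 43.1)]
    print(f"BD-rate: {bd_rate(anchor, test):.2f}%")  # negative means bitrate savings
```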

109-120
Abstract

Augmented reality, which introduces into the field of perception data that provide the best visualization of information, is increasingly attracting the attention of specialists in software for demonstration complexes and geographic information systems. The visibility of the visualized information is important both for the operator and for users of the information system. Using the laws of visual perception associated with the properties of the «golden» section, it is possible to formulate a visualization criterion that characterizes the integrated perception of information displayed on the screen of a video monitor or projection panel. The purpose of the study presented in the article is to define such a criterion based on the properties of the «golden» section and to study the conditions for meeting it, using the display of metadata on a monitor screen and a projection panel as an example. The criterion is determined through the coefficient of coverage of the screen area with information, whose optimal value corresponds to the mathematical definition of the «golden» section. The main results of the study are the analysis of the properties of the «golden» section when displaying information and the definition of the visualization criterion, which allows operators and consumers to comprehensively perceive video data on electronic projection tools. Iterative algorithms have been developed for selecting the scale of data display according to the visibility criterion: an algorithm for analyzing the data of a displayed layer, using an electronic map as an example, and an algorithm for sequential layer analysis. The influence of the scale of the displayed data on the visibility of their visualization on screens of various sizes is investigated. The practical value of the results lies in the fact that the proposed criterion provides a mathematical interpretation of the «golden» section property for the visualization of information on modern electronic means of displaying data.
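
One possible reading of the criterion (an assumption for illustration, not the authors' exact formula) is that the coverage coefficient is the share of the screen area occupied by information and its optimal value is tied to the golden section:

```latex
% Assumed interpretation of the coverage coefficient and its optimal value.
\[
  k = \frac{S_{\text{info}}}{S_{\text{screen}}}, \qquad
  k_{\text{opt}} \approx \frac{1}{\varphi} = \frac{2}{1+\sqrt{5}} \approx 0.618,
  \quad \text{where } \varphi = \frac{1+\sqrt{5}}{2},
\]
\[
  \text{so that } \frac{S_{\text{screen}}}{S_{\text{info}}}
  = \frac{S_{\text{info}}}{S_{\text{screen}} - S_{\text{info}}} = \varphi .
\]
```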

121-136
Abstract

The goal of the research is to develop and test methods for detecting people, parametric points of their hands, and their current working tools in video frames. The following algorithms are implemented: detection of human bounding box coordinates in video frames; human pose estimation, i.e. detection of parametric points for each person; detection of bounding box coordinates of the defined tools; and estimation of which tool a person is using at a given moment. To implement these algorithms, existing computer vision models are used for the following tasks: object detection, pose estimation, and object overlaying. A machine learning system for working time detection based on computer vision is developed and deployed as a web service. Recall, precision, and F1-score are used as metrics for the multi-class classification problem of determining which type of tool a person uses in a given video frame (object overlaying). The solution of the action detection problem for the railway industry is new in terms of estimating work activity from video and optimizing working time based on human action detection. As the videos are recorded with a certain positioning of cameras and certain lighting, the system has some limitations on how video should be filmed. Another limitation is the set of working tools (pliers, wrench, hammer, chisel). Further development of the work might involve algorithms for 3D modeling, modeling activity as a sequence of frames (RNN and LSTM models), development of an action detection model, working time optimization, and a recommendation system for the working process based on video activity detection.
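
A minimal sketch of the "which tool is in use" step might pair the pose estimator's hand keypoints with the detected tool boxes, as below. This is an assumed heuristic for illustration (containment first, then nearest box within a distance cutoff), not the system deployed by the authors; all detections are made up.

```python
# Illustrative sketch: decide which tool a person is using in a frame by checking
# which detected tool box contains (or is nearest to) a hand keypoint.
from math import hypot

def point_in_box(pt, box):
    (x, y), (x1, y1, x2, y2) = pt, box
    return x1 <= x <= x2 and y1 <= y <= y2

def tool_in_use(hand_points, tool_boxes):
    """hand_points: list of (x, y); tool_boxes: dict tool_name -> (x1, y1, x2, y2)."""
    best, best_dist = "none", float("inf")
    for name, box in tool_boxes.items():
        for pt in hand_points:
            if point_in_box(pt, box):
                return name  # a hand keypoint inside the tool box wins outright
            cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
            d = hypot(pt[0] - cx, pt[1] - cy)
            if d < best_dist:
                best, best_dist = name, d
    # Fall back to the nearest tool only if it is reasonably close (assumed 50 px).
    return best if best_dist < 50 else "none"

if __name__ == "__main__":
    hands = [(120, 200), (340, 210)]
    tools = {"hammer": (100, 180, 160, 240), "wrench": (500, 300, 560, 360)}
    print(tool_in_use(hands, tools))  # "hammer"
```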

137-152
Abstract

Topic modeling is an area of natural language processing that has been actively developed in the last 15 years. A probabilistic topic model extracts a set of hidden topics from a collection of text documents. It defines each topic by a probability distribution over words and describes each document with a probability distribution over topics. The exploding volume of text data motivates the community to constantly upgrade topic modeling algorithms for multiprocessor systems. In this paper, we provide an overview of effective EM-like algorithms for learning latent Dirichlet allocation (LDA) and additively regularized topic models (ARTM). Firstly, we review 11 techniques for efficient topic modeling based on synchronous and asynchronous parallel computing, distributed data storage, streaming, batch processing, RAM optimization, and fault tolerance improvements. Secondly, we review 14 effective implementations of topic modeling algorithms proposed in the literature over the past 10 years, which use different combinations of the techniques above. Their comparison shows the lack of a perfect universal solution. All improvements described are applicable to all kinds of topic modeling algorithms: PLSA, LDA, MAP, VB, GS, and ARTM.
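
As background for the algorithms being surveyed, here is a toy EM iteration for a PLSA-like topic model on a small term-document matrix; it is a didactic sketch only, without the parallel, batched, regularized, or fault-tolerant machinery the paper reviews.

```python
# Illustrative sketch of one EM iteration for a PLSA-like topic model.
import numpy as np

def em_iteration(ndw, phi, theta):
    """ndw: D x W word counts; phi: T x W p(w|t); theta: D x T p(t|d)."""
    D, W = ndw.shape
    T = phi.shape[0]
    n_wt = np.zeros((T, W))
    n_td = np.zeros((D, T))
    for d in range(D):
        # E-step: posterior p(t | d, w) for every word of the document.
        p_tdw = phi * theta[d][:, None]                     # T x W
        p_tdw /= p_tdw.sum(axis=0, keepdims=True) + 1e-12
        # Accumulate expected counts weighted by observed frequencies.
        n_wt += p_tdw * ndw[d]
        n_td[d] = (p_tdw * ndw[d]).sum(axis=1)
    # M-step: renormalize the counts into distributions.
    phi_new = n_wt / (n_wt.sum(axis=1, keepdims=True) + 1e-12)
    theta_new = n_td / (n_td.sum(axis=1, keepdims=True) + 1e-12)
    return phi_new, theta_new

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ndw = rng.integers(0, 5, size=(4, 10))                  # 4 documents, 10 words
    phi = rng.random((2, 10)); phi /= phi.sum(axis=1, keepdims=True)
    theta = rng.random((4, 2)); theta /= theta.sum(axis=1, keepdims=True)
    for _ in range(20):
        phi, theta = em_iteration(ndw, phi, theta)
    print(np.round(theta, 3))                               # topic mixture per document
```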

153-180
Abstract

Many experts in the field of data management believe that the emergence of non-volatile byte-addressable main memory (NVM) available for practical use will lead to the development of a new type of ultra-high-speed database management systems (DBMS) with single-level data storage (native in-NVM DBMS). However, the number of researchers actively investigating architectures of native in-NVM DBMS has not increased in recent years; the most active researchers are PhD students who are not afraid of the risks that, of course, exist in this new area. The second section of the article discusses the state of the art in NVM hardware. The analysis shows that NVM in the DIMM form factor has already become a reality and that in the near future we can expect the appearance on the market of NVM-DIMMs with the speed of conventional DRAM and endurance close to that of hard drives. The third section is devoted to a review of related work, among which the works of young researchers are the most advanced. In the fourth section, we state and justify that the work performed so far in the field of in-NVM DBMS has not led to the emergence of a native architecture, being hampered by a set of limiting factors that we analyze. In this regard, in the fifth section we present a sketch of a native architecture of an in-NVM DBMS whose design choices are driven only by the goals of simplicity and efficiency. In conclusion, we summarize the article and argue the need for additional research into many aspects of the native architecture of an in-NVM DBMS.

181-204
Abstract

The term Big Data refers to the extensive collections of digital data generated every second. The datasets produced throughout the world come in structured, semi-structured, and unstructured formats, which are difficult for traditional database management systems to analyze. Recently, big data analytics has emerged as an essential research area due to the popularity of the Internet and the advent of new Web technologies. This growing area of research is multi-disciplinary and attracts researchers from various fields, who are invited to design, develop, and implement tools, technologies, architectures, and platforms for analyzing these large volumes of data. This paper begins with a brief introduction to big data and related concepts, including the main characteristics of big data, followed by a discussion of the most significant open research challenges and emerging trends. Next, it reviews big data analytics, the advantages of using big data solutions, and the preliminary assessments required before migrating from traditional solutions. Finally, it presents a review of the main recent applications to give a broad perspective on big data analytics.

205-220
Abstract

As the efficiency of main and external memory grows and hardware costs decrease, the performance of database management systems (DBMS) on certain kinds of queries is increasingly determined by CPU characteristics and the way the CPU is utilized. Relational DBMS employ diverse execution models to run SQL queries. These models have different properties, but all of them suffer from substantial overhead during query plan interpretation. The overhead comes from indirect calls to handler functions, runtime checks, and a large number of branch instructions. One way to solve this problem is dynamic query compilation, which is reasonable only when query interpretation time exceeds the combined time of compilation and optimized machine code execution. This requirement can be satisfied only when the amount of data to be processed is large enough. If query interpretation takes milliseconds to finish, the cost of dynamic compilation can be hundreds of times higher than the execution time of the generated machine code. To pay off the cost of dynamic compilation, the generated machine code has to be reused in subsequent executions, saving the cost of compilation and optimization. In this paper, we examine a method of machine code caching in our query JIT compiler for the PostgreSQL DBMS. The proposed method allows us to eliminate the compilation overhead. The results show that dynamic compilation of queries with machine code caching gives a significant speedup on OLTP queries.
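
The caching idea itself can be sketched independently of PostgreSQL: compiled artifacts are looked up by a fingerprint of the parameterized query plan, so structurally identical queries skip recompilation. The Python sketch below is only an illustration of that scheme (the class, fingerprinting, and placeholder handling are assumptions), not the authors' LLVM-based JIT.

```python
# Illustrative sketch of machine code caching for a query JIT, keyed by a
# fingerprint of the parameterized plan so later executions skip compilation.
import hashlib

class CompiledQueryCache:
    def __init__(self):
        self._cache = {}  # plan fingerprint -> "compiled" executable

    @staticmethod
    def fingerprint(plan_text: str) -> str:
        # Constants are assumed to be replaced by placeholders upstream,
        # so structurally identical plans share one cache entry.
        return hashlib.sha256(plan_text.encode()).hexdigest()

    def get_or_compile(self, plan_text: str, compile_fn):
        key = self.fingerprint(plan_text)
        if key not in self._cache:
            # Pay the expensive compilation cost only on the first execution.
            self._cache[key] = compile_fn(plan_text)
        return self._cache[key]

if __name__ == "__main__":
    cache = CompiledQueryCache()

    def fake_compile(plan):
        print("compiling:", plan)
        return lambda params: f"rows for {params}"

    run = cache.get_or_compile("SELECT * FROM t WHERE id = $1", fake_compile)
    print(run((42,)))
    # The second call with the same plan reuses cached code; no "compiling" output.
    run = cache.get_or_compile("SELECT * FROM t WHERE id = $1", fake_compile)
    print(run((7,)))
```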



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)