The notion of a software architecture style is frequently encountered in the software engineering literature. It is treated as important in books on software architecture and in university courses. However, many software developers are less enthusiastic about it, and it is not clear whether this notion is merely an academic concept or is actually used in the software industry. In this paper, we measured industrial software developers' attitudes towards the concept of a software architecture style. We also investigated the popularity of eleven concrete architecture styles. We applied two methods. A developer survey was used to estimate developers' overall attitude and to determine what the community thinks about automatic recognition of software architecture styles. Automatic crawlers were used to mine open-source code from the GitHub platform. These crawlers identified style smells in repositories using the features we proposed for the architecture styles. We found that the notion of a software architecture style is not just a concept of academics in universities: many software developers apply it in their work. We formulated features for the eleven concrete software architecture styles and developed crawlers based on these features. The repository mining results showed which styles are popular among developers of open-source projects from commercial companies and non-commercial communities. The automatic mining results were additionally validated by a survey of GitHub developers.
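As an illustration, a minimal sketch of what such a feature-based crawler check might look like (the layered-style heuristic, the directory names, and the repository are hypothetical; only the public GitHub contents API is assumed):

```python
import requests

# Hypothetical heuristic: a repository "smells" of the layered style
# if its top level contains directories commonly used for layers.
LAYER_DIRS = {"presentation", "business", "domain", "persistence", "data"}

def top_level_dirs(owner: str, repo: str) -> set:
    """List top-level directory names via the public GitHub contents API."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/"
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return {entry["name"].lower() for entry in resp.json() if entry["type"] == "dir"}

def smells_layered(owner: str, repo: str) -> bool:
    # Two or more layer-like directories are taken as a layered-style smell.
    return len(top_level_dirs(owner, repo) & LAYER_DIRS) >= 2

# Example (hypothetical repository):
# print(smells_layered("example-org", "example-repo"))
```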
Static program analysis is gradually adopting advanced use cases, and its integration with programming tools is becoming more necessary than ever. However, each integration requires a different kind of functionality implemented within an analyzer. For example, continuous integration tools typically analyze projects from scratch, while doing the same for code querying is not efficient performance-wise. The code behind such use cases constitutes "service models", and it tends to differ significantly between them. In this paper, we analyze the models which a static analyzer might use to provide its services, considering aspects of security, performance, and long-term storage. All models are assigned to one of the following groups: logical presence (where the actual computation is performed), resource acquisition, input/output, change accounting, and historic data tracking. Usage recommendations, advantages, and disadvantages are listed for each reviewed model. Input/output models are tested for actual network throughput. We also describe a model which might aggregate all these use cases. The model is partially evaluated within the work-in-progress static analyzer Equid, and the observations are presented.
Automated testing frameworks are widely used for assuring the quality of modern software in a secure software development lifecycle. Sometimes the quality of specific software needs to be assured and, hence, a specific approach should be applied. In this paper, we present an approach and the implementation details of an automated testing framework suitable for acceptance testing of static source code analysis tools. The presented framework is used for continuous testing of static source code analyzers for C, C++ and Python programs.
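As an illustration, a minimal sketch of an acceptance test in the spirit of such a framework (the analyzer binary name, its flags, and the expected diagnostic text are hypothetical placeholders; pytest's tmp_path fixture is assumed):

```python
import subprocess

# Hypothetical acceptance test: the analyzer binary ("analyzer"), its flags,
# and the expected diagnostic text are assumptions made for illustration.
def test_detects_null_dereference(tmp_path):
    source = tmp_path / "sample.c"
    source.write_text("int main(void) { int *p = 0; return *p; }\n")

    result = subprocess.run(
        ["analyzer", "--checks=null-deref", str(source)],
        capture_output=True, text=True,
    )

    # Acceptance criterion: the known defect must be reported.
    assert "null pointer dereference" in result.stdout.lower()
```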
In this research-in-progress report, we propose a novel approach to unified cache usage analysis for implementing data layout optimizations in the LCC compiler for the Elbrus and SPARC architectures. The approach consists of three parts. The first part is generalizing two methods of estimating the number of cache misses and choosing the more applicable one for the compiler. The second part is finding an applicable solution to the problem of minimizing the number of cache misses. The third part is implementing this analysis in the compiler and using the analysis results for data layout transformations.
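For intuition only, a textbook-style estimate of the number of cache misses for a strided array traversal is sketched below; this is a simplified model, not the analysis implemented in LCC, and the 64-byte line size is an assumption:

```python
import math

def estimated_misses(n_elems: int, elem_size: int, stride_elems: int,
                     line_size: int = 64) -> int:
    """Textbook-style estimate of cold cache misses for a strided array walk.

    Illustrative model only: with a small stride, every cache line is
    touched once; once the byte stride reaches the line size, essentially
    every access misses.
    """
    stride_bytes = stride_elems * elem_size
    if stride_bytes >= line_size:
        return n_elems
    return math.ceil(n_elems * stride_bytes / line_size)

# Example: walking 1_000_000 doubles sequentially vs. with stride 16.
print(estimated_misses(1_000_000, 8, 1))   # ~125_000 misses
print(estimated_misses(1_000_000, 8, 16))  # ~1_000_000 misses
```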
Gradual typing is a modern approach for combining the benefits of static and dynamic typing. Although scientific research aims for soundness of type systems, many languages intentionally make their type systems unsound to improve performance. This paper describes an implementation of a dialect of the Lama programming language that supports gradual typing with explicit annotation of dangerous parts of code. The goal of the current implementation is to grant type safety to programs while keeping the power of their untyped expressiveness. The paper covers implementation issues and the properties of the created type system. Finally, some perspectives on improving the precision and soundness of the type system are discussed.
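As a loose analogy only (Lama code is not reproduced here), the following Python sketch shows the flavor of gradual typing with an explicitly annotated dynamic boundary; all names are hypothetical:

```python
from typing import Any

# Statically typed code and explicitly marked dynamic ("dangerous") code
# coexist; the boundary is where run-time checks or unsoundness may appear.
def total_price(prices: list[float], tax: float) -> float:
    # Fully annotated part: checked statically by a gradual type checker.
    return sum(prices) * (1.0 + tax)

def load_config(raw: Any) -> dict:
    # Explicitly annotated dangerous part: `Any` marks the untyped boundary,
    # so a run-time check is needed before the value re-enters typed code.
    if not isinstance(raw, dict):
        raise TypeError("config must be a mapping")
    return raw

print(total_price([10.0, 2.5], 0.2))
```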
The problem of automatic request classification, as well as the problem of determining the routing rules for requests on the server side, is directly connected with analysis of the user interface of dynamic web pages. This problem can be solved at the browser level, since the browser contains complete information about possible requests arising from interaction between the user and the web application. In this paper, we suggest using data from the request execution context in the web client to extract classification features. A request context, or a request trace, is a collection of additional identification data that can be obtained by observing the execution of the web page's JavaScript code or the changes of user interface elements caused by their activation. Such data include, for example, the position and the style of the element that caused the client request, the JavaScript function call stack, and the changes in the page's DOM tree after the request was initiated. In this study, an implementation of the Chrome DevTools Protocol is used to solve the problem at the browser level and to automate request trace collection.
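A minimal sketch of collecting such a request trace over the Chrome DevTools Protocol is shown below; Playwright is used here merely as a convenient CDP client, and the target URL and selectors are placeholders, so this is not the paper's actual tooling:

```python
from playwright.sync_api import sync_playwright

# Sketch: subscribe to CDP network events and record, for each request,
# its URL, method, and the JavaScript initiator (call stack) that caused it.
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    cdp = page.context.new_cdp_session(page)
    cdp.send("Network.enable")

    traces = []

    def on_request(event):
        # The initiator field carries the JavaScript call stack that
        # triggered the request -- one of the features discussed above.
        traces.append({
            "url": event["request"]["url"],
            "method": event["request"]["method"],
            "initiator": event.get("initiator", {}),
        })

    cdp.on("Network.requestWillBeSent", on_request)
    page.goto("https://example.org")       # placeholder page
    # page.click("text=Submit")            # activate a UI element on a real page
    browser.close()

print(len(traces), "requests captured")
```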
Over the past decade, the Internet has become a gigantic and rich source of data. The data is used for the extraction of knowledge by performing machine learning analysis. In order to perform data mining of web information, the data should be extracted from the source and placed into analytical storage; this is the ETL process. Different web sources provide different ways to access their data: either an API over the HTTP protocol or HTML source code parsing. The article is devoted to an approach for high-performance data extraction from sources that do not provide an API to access the data. The distinctive features of the proposed approach are load balancing, two levels of data storage, and separating the process of downloading files from the process of scraping. The approach is implemented in a solution with the following technologies: Docker, Kubernetes, Scrapy, Python, MongoDB, Redis Cluster, and CephFS. The results of testing the solution are described in this article as well.
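A minimal sketch of the "separate scraping from downloading" idea, assuming a Scrapy spider that only parses pages and queues file URLs in Redis for a separate downloader (host names, selectors, queue name, and the start URL are placeholders):

```python
import redis
import scrapy

# The spider only parses pages and queues file URLs in Redis, while a
# separate worker pool drains the queue and downloads the files.
queue = redis.Redis(host="redis-cluster", port=6379)

class CatalogSpider(scrapy.Spider):
    name = "catalog"
    start_urls = ["https://example.org/catalog"]

    def parse(self, response):
        # Scraped metadata goes straight to the item pipeline (e.g. MongoDB).
        for card in response.css("div.item"):
            yield {
                "title": card.css("h2::text").get(),
                "price": card.css("span.price::text").get(),
            }
        # File URLs are only queued here, not downloaded.
        for href in response.css("a.file::attr(href)").getall():
            queue.rpush("download_queue", response.urljoin(href))
        # Follow pagination so the load can be spread across crawler instances.
        for next_page in response.css("a.next::attr(href)").getall():
            yield response.follow(next_page, callback=self.parse)
```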
The paper provides an overview of first impressions of a language for implementing the low-code approach; about a month has passed since its release date.
These days, most time-critical business processes are performed using computer technologies. As an example, one can consider financial processes, including trading on stock exchanges powered by electronic communication protocols such as the Financial Information eXchange (FIX) Protocol. One of the main challenges emerging with such processes concerns maintaining the best possible performance, since any unspecified delay may cause a large financial loss or other damage. Therefore, performance analysis of time-critical systems and applications is required. In the current work, we develop a novel method for the performance analysis of time-critical applications based on the db-net formalism, which combines the ability of colored Petri nets to model a system's control flow with the ability to model relational database states. This method makes it possible to conduct a performance analysis for time-critical applications that work as transactional systems and have log messages which can be represented in the form of table records in a relational database. One such application is a FIX protocol-based trading communication system. This system is used in the work to demonstrate the applicability of the proposed method for the performance analysis of time-critical systems. However, there are plenty of similar systems in different domains, and the method can also be applied to the performance analysis of those systems. A software prototype was developed for testing and demonstrating the abilities of the method. The prototype is based on an extension of the Renew software tool, which is a reference net simulator. The testing input for the prototype includes a test log with FIX messages, provided by a software developer of testing solutions for one of the global stock exchanges. An application of the method to a quantitative analysis of maximum acceptable delay violations is presented. The developed method allows conducting a performance analysis as a part of conformance checking of the system under consideration. The method can be used in further research in this domain as well as in testing the performance of real time-critical software systems.
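To make the table-record view of FIX logs concrete, a small sketch of turning a FIX message into a flat record and checking a maximum-acceptable-delay violation (the 5 ms threshold and the use of tags 52 and 60 are illustrative assumptions, not the paper's parameters):

```python
from datetime import datetime

SOH = "\x01"  # standard FIX field delimiter

def parse_fix(message: str) -> dict:
    """Turn a FIX message into a flat record suitable for a relational table."""
    pairs = (field.split("=", 1) for field in message.strip(SOH).split(SOH))
    return {tag: value for tag, value in pairs}

def delay_exceeded(record: dict, max_delay_ms: float = 5.0) -> bool:
    """Check a maximum-acceptable-delay violation between order creation
    (tag 60, TransactTime) and sending (tag 52, SendingTime)."""
    fmt = "%Y%m%d-%H:%M:%S.%f"
    sent = datetime.strptime(record["52"], fmt)
    created = datetime.strptime(record["60"], fmt)
    return (sent - created).total_seconds() * 1000 > max_delay_ms

msg = SOH.join([
    "8=FIX.4.4", "35=D", "11=ORD1",
    "60=20230401-10:15:30.100", "52=20230401-10:15:30.112",
]) + SOH
record = parse_fix(msg)
print(delay_exceeded(record))  # True: 12 ms > 5 ms
```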
Modelling is considered a universal approach to defining and simplifying real-world applications through appropriate abstraction. Model-driven system engineering identifies and integrates appropriate concepts, techniques, and tools which provide important artefacts for interdisciplinary activities. In this paper, we show how we used a model-driven approach to design and improve a Digital Humanities dynamic web application within an interdisciplinary project that enables history students and volunteers of history associations to transcribe a large corpus of image-based data from the General Register Office (GRO) records. Our model-driven approach generates the software application from abstract data, workflow, and GUI models, ready for deployment.
True concurrency models, and in particular event structures, were introduced in the 1980s as an alternative to the operational interleaving semantics of concurrency, and nowadays they are regaining popularity. Event structures represent the causal dependency and conflict between the individual atomic actions of the system directly. This property leads to a more compact and concise representation of semantics. In this work-in-progress report, we present a theory of event structures mechanized in the Coq proof assistant and demonstrate how it can be applied to define certified executable semantics of a simple parallel register machine with shared memory.
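For readers unfamiliar with the formalism, a small illustration of a prime event structure and its configurations, written in Python rather than Coq (the events and relations form a made-up example, not the mechanized theory itself):

```python
from itertools import combinations

# A prime event structure: events with a causality relation and a symmetric
# conflict relation. A configuration is a set of events that is causally
# closed and conflict-free.
class EventStructure:
    def __init__(self, events, causes, conflicts):
        self.events = set(events)
        self.causes = {e: set(causes.get(e, ())) for e in events}  # strict causes of e
        self.conflicts = {frozenset(c) for c in conflicts}

    def is_configuration(self, config) -> bool:
        config = set(config)
        downward_closed = all(self.causes[e] <= config for e in config)
        conflict_free = all(
            frozenset((a, b)) not in self.conflicts
            for a, b in combinations(config, 2)
        )
        return downward_closed and conflict_free

# Two writes to a shared register conflict; each write is caused by its read.
es = EventStructure(
    events={"r1", "r2", "w1", "w2"},
    causes={"w1": {"r1"}, "w2": {"r2"}},
    conflicts=[("w1", "w2")],
)
print(es.is_configuration({"r1", "w1", "r2"}))   # True
print(es.is_configuration({"r1", "w1", "w2"}))   # False: w2 needs r2 and conflicts with w1
```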
In this paper, we present an approach to the generation of Petri nets exhibiting desired structural and behavioral properties. Given a reference Petri net, we apply a collection of local refinement transformations, which extend the internal structure of the reference model. The correctness of applying these transformations is justified via Petri net morphisms and by the fact that the transformations do not add new deadlocks to Petri nets. We have designed two Petri net refinement algorithms supporting randomized and fixed generation of models. These algorithms have been implemented and evaluated within the environment of the Carassius Petri net editor. The proposed approach can be applied to evaluate and conduct experiments with algorithms operating on Petri nets.
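As an illustration of one kind of local refinement (a sketch only, not the paper's algorithms), the following replaces a transition by a sequence of two transitions with an intermediate place, extending the structure without introducing new deadlocks:

```python
# Sketch of a transition refinement: t is replaced by t_in -> p_t -> t_out,
# preserving the connections to t's pre- and post-places.
class PetriNet:
    def __init__(self):
        self.places, self.transitions = set(), set()
        self.arcs = set()  # directed (source, target) pairs

    def add_place(self, p): self.places.add(p)
    def add_transition(self, t): self.transitions.add(t)
    def add_arc(self, src, dst): self.arcs.add((src, dst))

def refine_transition(net: PetriNet, t: str) -> None:
    pre = {p for (p, x) in net.arcs if x == t}
    post = {p for (x, p) in net.arcs if x == t}
    net.transitions.discard(t)
    net.arcs = {(a, b) for (a, b) in net.arcs if t not in (a, b)}

    t_in, t_out, p_mid = f"{t}_in", f"{t}_out", f"p_{t}"
    net.add_transition(t_in); net.add_transition(t_out); net.add_place(p_mid)
    for p in pre:
        net.add_arc(p, t_in)
    net.add_arc(t_in, p_mid)
    net.add_arc(p_mid, t_out)
    for p in post:
        net.add_arc(t_out, p)

# Reference net: p1 -> t -> p2, then refine t.
net = PetriNet()
for p in ("p1", "p2"): net.add_place(p)
net.add_transition("t")
net.add_arc("p1", "t"); net.add_arc("t", "p2")
refine_transition(net, "t")
print(sorted(net.arcs))
```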
These days, real-time analytics is one of the most frequently used notions in the world of databases. Broadly, this term means very fast analytics over very fresh data. Usually the term comes together with other popular terms: hybrid transactional/analytical processing (HTAP) and in-memory data processing. The reason is that the simplest way to provide fresh operational data for analysis is to combine transactional and analytical processing in one system, and the most effective way to provide fast transactional and analytical processing is to store the entire database in memory. So, on the one hand, these three terms are related, but on the other hand, each of them has its own right to exist. In this paper, we provide an overview of several in-memory data management systems that are not HTAP systems. Some of them are purely transactional, some are purely analytical, and some support real-time analytics. Then we overview nine in-memory HTAP DBMSs, some of which do not support real-time analytics. Existing real-time in-memory HTAP DBMSs have very diverse and interesting architectures, although they use a number of common approaches: multiversion concurrency control, multicore parallelization, advanced query optimization, just-in-time compilation, etc. Additionally, we are interested in whether these systems use non-volatile memory and, if so, in what manner. We conclude that the emergence of a new generation of NVM will greatly stimulate its use in in-memory HTAP systems.
A large text can convey various forms of sentiment information, including the author's position, positive or negative effects of some events, and attitudes of the mentioned entities towards each other. In this paper, we experiment with BERT-based language models for extracting sentiment attitudes between named entities. Given a mass media article and a list of mentioned named entities, the task is to extract positive or negative attitudes between them. The efficiency of language model methods depends on the amount of training data. To enrich the training data, we adopt a distant supervision method, which provides automatic annotation of unlabeled texts using an additional lexical resource. The proposed approach is subdivided into two stages: (1) sentiment pair list completion (PAIR-BASED), (2) document annotation using PAIR-BASED and FRAME-BASED factors. Applied to a large news collection, the method generates the automatically annotated RuAttitudes2017 collection. We evaluate the approach on RuSentRel-1.0, which consists of mass media articles written in Russian. Adopting RuAttitudes2017 in the training process results in a 10-13% quality improvement in F1-measure over supervised learning and a 25% improvement over the results of the top neural network based model.
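A toy sketch of the pair-based distant supervision stage, assuming a hypothetical precompiled list of sentiment pairs and pre-extracted named entities per sentence (the pair list and the example document are made up for illustration):

```python
# Unlabeled texts are annotated automatically using a precompiled list of
# (subject, object) -> label sentiment pairs.
SENTIMENT_PAIRS = {
    ("CountryA", "CountryB"): "neg",
    ("CompanyX", "CompanyY"): "pos",
}

def annotate(sentences):
    """Yield (subject, object, label) attitudes found in the sentences."""
    for entities in sentences:  # each item: named entities of one sentence
        for subj in entities:
            for obj in entities:
                label = SENTIMENT_PAIRS.get((subj, obj))
                if label is not None:
                    yield subj, obj, label

doc = [["CountryA", "CountryB", "PersonZ"], ["CompanyX", "CompanyY"]]
print(list(annotate(doc)))
# [('CountryA', 'CountryB', 'neg'), ('CompanyX', 'CompanyY', 'pos')]
```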