Preview

Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS)

Advanced search

Path querying on acyclic graphs using Boolean grammars

https://doi.org/10.15514/ISPRAS-2019-31(4)-14

Abstract

Graph data models are widely used in different areas of computer science such as bioinformatics, graph databases, social networks and static code analysis. One of the problems in graph data analysis is querying for specific paths. Such queries are usually performed by means of a formal grammar that describes the allowed edge-labeling of the paths. Path query is said to be calculated using relational query semantics if it is evaluated to triple , such that there is a path from  to  such that the labels on the edges of this path form a string derivable from the nonterminal . As the regular and context-free languages have limited expressive power, we focus on a more expressive languages, namely the Boolean languages that use Boolean grammars to describe the labeling of paths. Although path querying using relational query semantics and Boolean grammars is known to be undecidable, in this work we propose a path querying algorithm on acyclic graphs which uses relational query semantics and Boolean grammars and approximates the exact solution. To achieve better performance in compare with the naive algorithm, considered classes of graphs were limited to acyclic graphs.

About the Authors

Ekaterina Nikolaevna Shemetova
Saint Petersburg National Research University of Information Technologies, Mechanics and Optics
Russian Federation


Semyon Vyatcheslavovitch Grigorev
Saint Petersburg State University
Russian Federation
PhD in Phisical and Mathematical sciences, associated professor of the Department of Informatics at St Petersburg University


References

1. Barceló Baeza P. Querying graph databases. In Proc. of the 32nd ACM SIGMOD-SIGACT-SIGAI symposium on Principles of database systems (PODS '13), 2013, pp. 175-188.

2. Mendelzon A., Wood P. Finding Regular Simple Paths in Graph Databases. SIAM Journal on Computing, vol. 24, № 6, 1995, pp. 1235-1258.

3. Anderson J. W. et al. Quantifying variances in comparative RNA secondary structure prediction. BMC bioinformatics, vol. 14, 2013.

4. Chaudhary A., Faisal A. Role of graph databases in social networks. 2016, available at: https://www.researchgate.net/publication/304462174_Role_of_graph_databases_in_social_networks, accessed 02.06.2019.

5. Warchał, Ł. Using Neo4j graph database in social network analysis. Studia Informatica, vol. 33, № 2A, 2012, pp. 271-279.

6. Reps T. Program analysis via graph reachability. In Proc. of the 1997 international symposium on Logic programming (ILPS '97), 1997, pp. 5-19.

7. Zhang Q., Su Z. Context-sensitive data-dependence analysis via linear conjunctive language reachability. In Proc. of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, 2017, pp. 344–358.

8. Yuan H., Eugster P. An Efficient Algorithm for Solving the Dyck-CFL Reachability Problem on Trees. In Proc. of the 18th European Symposium on Programming Languages and Systems: Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009 (ESOP '09), 2009, pp. 175-189.

9. Neo4j's Graph Query Language: An Introduction to Cypher. Available at: https://neo4j.com/developer/cypher-query-language/, accessed 02.06.2019.

10. Rodriguez M. A. The gremlin graph traversal machine and language (invited talk). In Proc. of the 15th Symposium on Database Programming Languages, 2015, pp. 1–10.

11. Eric Prud'hommeaux, Andy Seaborne (eds). SPARQL query language for RDF. W3C Recommendation 15 January 2008. Available at: https://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/, accessed 02.06.2019.

12. Okhotin A. Conjunctive grammars. Journal of Automata, Languages and Combinatorics, vol. 6, № 4, 2001, pp. 519–535.

13. Okhotin A. Boolean grammars. Information and Computation, vol. 194, issue 1, 2004, pp. 19–48.

14. Hellings J. Querying for Paths in Graphs using Context-Free Path Queries. CoRR abs/1502.02242, 2015.

15. Hellings. J. Conjunctive context-free path queries. In Proc. of the International Conference on Database Theory (ICDT’14), 2014, pp.119–130.

16. Sevon P., Eronen L. Subgraph queries by context-free grammars. Journal of Integrative Bioinformatics, vol. 5, № 2, 2008.

17. Zhang X. et al. Context-free path queries on RDF graphs. In Proc. of the International Semantic Web Conference, 2016, pp. 632–648.

18. Grigorev S., Ragozina A. Context-free path querying with structural representation of result. In Proc. of the 13th Central & Eastern European Software Engineering Conference in Russia (CEE-SECR '17), article 10.

19. Azimov R.Sh., Grigorev S.V. Context-Free Path Querying by Matrix Multiplication. CoRR, vol. abs/1707.01007, 2017.

20. Азимов Р.Ш., Григорьев С.В. Синтаксический анализ графов с использованием конъюнктивных грамматик. Труды ИСП РАН, том 30, вып. 2, 2018 г., стр. 149-166 / Azimov R.Sh., Grigorev S.V. Path querying using conjunctive grammars. Trudy ISP RAN/Proc. ISP RAS, vol. 30, issue. 2, 2018, pp. 149-166 (in Russian). DOI: 10.15514/ISPRAS-2018-30(2)-8.

21. Valiant L.G. General context-free recognition in less than cubic time. Journal of computer and system sciences, vol. 10, № 2, 1975, pp. 308– 315.

22. Okhotin A. Parsing by matrix multiplication generalized to Boolean grammars. Theoretical Computer Science, vol. 516, 2014, pp. 101–120.

23. Арлазаров В.Л., Диниц Е.А., Кронрод М.А., Фараджев И.А. Об экономном построении транзитивного замыкания ориентированного графа. Доклады АН СССР, том 194, № 3, 1970, pp. 487–488 / Arlazarov V.L., Dinitz Y.A., Kronrod M.A., Faradzhev I.A. On economical construction of the transitive closure of an oriented graph. Doklady Akademii Nauk SSSR, vol. 194, № 3, 1970, pp. 487–488 (in Russian).

24. Vassilevska Williams V. Multiplying matrices faster than Coppersmith-Winograd. In Proc. of the 44th Symposium on Theory of Computing, (STOC2012), 2012, pp. 887–898.

25. Tarjan R.E. Depth-first search and linear graph algorithms. In Proc. of the 12th Annual Symposium on Switching and Automata Theory (SWAT 1971), 1971, pp. 114-121.

26. Koschmieder André, Leser Ulf. Regular Path Queries on Large Graphs. In Proc. of the 24th International Conference on Scientific and Statistical Database Management (SSDBM’12), 2012, pp. 177–194.

27. Reutter Juan L., Romero Miguel, Vardi Moshe Y. Regular Queries on Graph Databases. Theory of Computing Systems, vol. 61, № 1, 2017, pp. 31–83.

28. Lu C., Yu J. X., Li R., Wei H. Exploring Hierarchies in Online Social Networks. IEEE Transactions on Knowledge and Data Engineering, vol. 28, № 8, 2016, pp. 2086-2100.

29. Sridharan, M., Gopan, D., Shan, L., Bod ́ık, R. Demand-driven points-to analysis for Java. In Proc. of the 20th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA2005), 2005, pp. 59–76.

30. Терехов А.Н., Эрлих Л.А., Терехов А.А. История и архитектура проекта RescueWare. В сб. «Автоматизированный реинжиниринг программ», Издательство Санкт-Петербургского государственного университета, 2000 г., стр. 7-19 / Terekhov A.N., Erlikh L.A., Terekhov A.A. History and architecture of a project RescueWare. In «Automated software reengineering», Publishing House of St. Petersburg State University, 2000, pp. 7-19 (in Russian).

31. Medeiros Ciro M., Musicante Martin A., Costa Umberto S. Efficient evaluation of context-free path queries for graph databases. In Proc. of the 33rd Annual ACM Symposium on Applied Computing (SAC '18), 2018, pp. 1230-1237.

32. Rosenkrantz D.J., Stearns R.E. Properties of deterministic top-down grammars. Information and Control, vol. 17, № 3, 1970, pp. 226-256.

33.


Review

For citations:


Shemetova E.N., Grigorev S.V. Path querying on acyclic graphs using Boolean grammars. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2019;31(4):211-226. (In Russ.) https://doi.org/10.15514/ISPRAS-2019-31(4)-14



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)