Usage of i-Vectors for Automated Determination of a Similarity Level between Languages

Ansis Ataols Bērziņš

doi:10.15514/ISPRAS-2019-31(5)-12

Abstract

The article describes the results of applying i-vector-based speech identification methods (both language identification, LID, and speaker identification, SID) to define a kind of distance between languages, where "language" is understood in a wide sense that includes dialects and any other forms of spoken language. The input to the method consists of spontaneous speech recordings from a sufficiently large number of speakers of each language. The experiments were carried out on recordings of Latvian and Latgalian dialects, but the method is applicable to any other idioms. Cosine similarity, the Euclidean metric, the standardized Euclidean metric, the Jordan (or Chebyshev) metric and the city block (or L₁) metric were tried out. Cosine similarity worked well for SID i-vectors, but for unknown reasons gave meaningless results for LID i-vectors. The Jordan metric worked well for LID i-vectors, but was not good enough for SID i-vectors. Standardization of the Euclidean metric did not give any improvement. Thus, the conclusions are: 1) both SID and LID i-vectors of full-length recordings of spontaneous speech characterize and represent languages well enough to be used for determining a distance between languages; 2) the best metrics for such tasks are the Euclidean and L₁ metrics (applied to arithmetic mean vectors computed coordinate by coordinate from the i-vectors of all informants).
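The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the use of NumPy, and the way the per-coordinate variances for the standardized Euclidean metric are supplied are all assumptions made for illustration. Each idiom is represented by the coordinate-wise arithmetic mean of the i-vectors of its informants, and the five measures named in the abstract are then evaluated on two such mean vectors.

```python
import numpy as np


def mean_ivector(ivectors):
    """Coordinate-wise arithmetic mean of the i-vectors of all informants of one idiom."""
    return np.asarray(ivectors, dtype=float).mean(axis=0)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means the same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def euclidean(a, b):
    """Euclidean (L2) distance."""
    return float(np.linalg.norm(a - b))


def standardized_euclidean(a, b, variances):
    """Euclidean distance with each coordinate scaled by its variance.

    How the variances are estimated is an assumption of this sketch;
    the abstract does not specify it.
    """
    return float(np.sqrt(np.sum((a - b) ** 2 / variances)))


def chebyshev(a, b):
    """Chebyshev distance: the largest absolute coordinate difference."""
    return float(np.max(np.abs(a - b)))


def city_block(a, b):
    """City block (L1) distance: the sum of absolute coordinate differences."""
    return float(np.sum(np.abs(a - b)))


if __name__ == "__main__":
    # Synthetic stand-ins for i-vectors: 20 informants per idiom, 8 dimensions.
    # These are random numbers, not data from the article.
    rng = np.random.default_rng(0)
    idiom_a = rng.normal(0.0, 1.0, size=(20, 8))
    idiom_b = rng.normal(0.5, 1.0, size=(20, 8))

    mean_a, mean_b = mean_ivector(idiom_a), mean_ivector(idiom_b)
    # Assumption: variances are pooled over all informants of both idioms.
    variances = np.vstack([idiom_a, idiom_b]).var(axis=0, ddof=1)

    print("cosine similarity:     ", cosine_similarity(mean_a, mean_b))
    print("Euclidean:             ", euclidean(mean_a, mean_b))
    print("standardized Euclidean:", standardized_euclidean(mean_a, mean_b, variances))
    print("Chebyshev:             ", chebyshev(mean_a, mean_b))
    print("city block (L1):       ", city_block(mean_a, mean_b))
```

Cosine similarity is a similarity measure (larger means closer), whereas the other four are distances (smaller means closer), so the two kinds of score have to be read in opposite directions when idioms are ranked.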

Keywords

speech, idiom, language, dialect, i-vector, LID, SID, recording, proximity of languages, distance between languages

About the Author

Ansis Ataols Bērziņš

http://ansis.lv/
University of Latvia
Latvia
Master of Mathematics; currently completing a thesis on computational linguistics.

For citations:

Bērziņš A. Usage of i-Vectors for Automated Determination of a Similarity Level between Languages. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2019;31(5):153-164. (In Russ.) https://doi.org/10.15514/ISPRAS-2019-31(5)-12

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 2079-8156 (Print)
ISSN 2220-6426 (Online)
