<!DOCTYPE article
PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20190208//EN"
       "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="1.4" xml:lang="en">
 <front>
  <journal-meta>
   <journal-id journal-id-type="publisher-id">Scientific Research and Development. Socio-Humanitarian Research and Technology</journal-id>
   <journal-title-group>
    <journal-title xml:lang="en">Scientific Research and Development. Socio-Humanitarian Research and Technology</journal-title>
    <trans-title-group xml:lang="ru">
     <trans-title>Научные исследования и разработки. Социально-гуманитарные исследования и технологии</trans-title>
    </trans-title-group>
   </journal-title-group>
   <issn publication-format="print">2306-1731</issn>
   <issn publication-format="online">2587-912X</issn>
  </journal-meta>
  <article-meta>
   <article-id pub-id-type="publisher-id">21073</article-id>
   <article-id pub-id-type="doi">10.12737/article_5bffa6b5ec8fd7.56294725</article-id>
   <article-categories>
    <subj-group subj-group-type="toc-heading" xml:lang="ru">
     <subject>Образовательные технологии</subject>
    </subj-group>
    <subj-group subj-group-type="toc-heading" xml:lang="en">
     <subject>Educational technologies</subject>
    </subj-group>
   </article-categories>
   <title-group>
    <article-title xml:lang="en">Author's Approach to the Analysis of the Frequency of Biological Terms in Printed Texts</article-title>
    <trans-title-group xml:lang="ru">
     <trans-title>Авторский подход к анализу частотности биологических терминов в печатных текстах</trans-title>
    </trans-title-group>
   </title-group>
   <contrib-group content-type="authors">
    <contrib contrib-type="author">
     <name-alternatives>
      <name xml:lang="ru">
       <surname>Марышкина</surname>
       <given-names>Таисия Владимировна</given-names>
      </name>
      <name xml:lang="en">
       <surname>Maryshkina</surname>
       <given-names>Taisiya Vladimirovna</given-names>
      </name>
     </name-alternatives>
     <email>taisiya.maryshkina@inbox.ru</email>
     <xref ref-type="aff" rid="aff-1"/>
    </contrib>
    <contrib contrib-type="author">
     <contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-2337-2280</contrib-id>
     <name-alternatives>
      <name xml:lang="ru">
       <surname>Калижанова</surname>
       <given-names>Анна Николаевна</given-names>
      </name>
      <name xml:lang="en">
       <surname>Kalizhanova</surname>
       <given-names>Anna Nikolaevna</given-names>
      </name>
     </name-alternatives>
     <email>annaanna1802@gmail.com</email>
     <xref ref-type="aff" rid="aff-2"/>
     <xref ref-type="aff" rid="aff-3"/>
    </contrib>
   </contrib-group>
   <aff-alternatives id="aff-1">
    <aff>
     <institution xml:lang="ru">Карагандинский государственный университет им. Е.А. Букетова</institution>
    </aff>
    <aff>
     <institution xml:lang="en">Karaganda State University named after E.A. Buketov</institution>
    </aff>
   </aff-alternatives>
   <aff-alternatives id="aff-2">
    <aff>
     <institution xml:lang="ru">Карагандинский университет &quot;Болашак&quot;</institution>
    </aff>
    <aff>
     <institution xml:lang="en">Karaganda &quot;Bolashak&quot; University</institution>
    </aff>
   </aff-alternatives>
   <aff-alternatives id="aff-3">
    <aff>
     <institution xml:lang="ru">Академия &quot;Bolashaq&quot;</institution>
     <city>Караганда</city>
     <country>Казахстан</country>
    </aff>
    <aff>
     <institution xml:lang="en">Bolashaq Academy</institution>
     <city>Karagandy</city>
     <country>Kazakhstan</country>
    </aff>
   </aff-alternatives>
   <volume>7</volume>
   <issue>4</issue>
   <fpage>11</fpage>
   <lpage>14</lpage>
   <self-uri xlink:href="https://naukaru.ru/en/nauka/article/21073/view">https://naukaru.ru/en/nauka/article/21073/view</self-uri>
   <abstract xml:lang="ru">
    <p>В этой статье описывается авторский подход к анализу частотности биологических терминов в печатных текстах школьных учебников полного курса биологии общеобразовательных школ, позволяющий сократить затраты по времени на обработку текста примерно в десять раз. Авторы статьи анализируют преимущества и недостатки приложений для текстового анализа и голосовой обработки и предлагают свой алгоритм быстрого подсчета ключевых терминов в традиционных текстах. Данное исследование выполнено в рамках грантового проекта КН МОН РК на тему «Создание трехъязычного словаря биологических терминов полного курса биологии с лингвокультурологическим компонентом».</p>
   </abstract>
   <trans-abstract xml:lang="en">
    <p>This article describes the authors' approach to analyzing the frequency of biological terms in the printed texts of school textbooks covering the full biology course of Kazakhstan secondary schools, an approach that reduces text-processing time roughly tenfold. The authors analyze the advantages and disadvantages of applications for text analysis and voice processing and propose their own algorithm for quickly counting key terms in traditional texts. This research was carried out within the framework of a grant-funded project of the Committee of Science of the Ministry of Education and Science of the Republic of Kazakhstan on the theme “Creation of a Trilingual Dictionary of Biological Terms of the Full Biology Course with a Linguacultural Component”.</p>
   </trans-abstract>
   <kwd-group xml:lang="ru">
    <kwd>частотность</kwd>
    <kwd>анализ частотности</kwd>
    <kwd>биологические термины</kwd>
    <kwd>подсчет</kwd>
    <kwd>алгоритм</kwd>
    <kwd>приложения</kwd>
    <kwd>интеллектуальный анализ текста</kwd>
    <kwd>печатные тексты</kwd>
   </kwd-group>
   <kwd-group xml:lang="en">
    <kwd>frequency</kwd>
    <kwd>frequency analysis</kwd>
    <kwd>biological terms</kwd>
    <kwd>counting</kwd>
    <kwd>algorithm</kwd>
    <kwd>applications</kwd>
    <kwd>text mining</kwd>
    <kwd>printed texts</kwd>
   </kwd-group>
  </article-meta>
 </front>
 <body>
  <sec>
   <title>Introduction</title>
   <p>Text mining, which is aimed at processing textual data, remains one of the core methods in numerous philological and corpus linguistics projects [1]. Text frequency analysis deals with the words or word clusters used in documents; with its help, one can identify similarities or differences between documents as well as their relations to other variables of interest in a data mining project [1]. Text mining is used in such areas as linguistics, marketing, computer science, and social studies, wherever researchers use the frequency of lexical units to better understand internet users' keyword searches [2]. Plenty of digital resources perform text frequency analysis and thus provide new ways of creating, processing, and analyzing such data by computer. However, few methods suit the text mining of printed sources, so this issue should be raised and solved to simplify the research flow and reduce time consumption.</p>
  </sec>
  <sec>
   <title>Research Issues and Objectives</title>
   <p>Frequency analysis became imperative for the authors of this article and the participants of a grant-funded project on the creation of a dictionary of biological terms with a linguacultural component, designed for Kazakhstani secondary school students studying biology in English under the program “The Trinity of Languages” [3]. One of the project stages involved a frequency analysis of biological terms in the Kazakhstani textbooks of the entire school biology course, the results of which would later be used to create a substantial vocabulary database.</p>
   <p>Having found only six biology textbooks in digital format, for the 5th, 7th, and 8th grades, the researchers faced challenges with the remaining books, which existed only as printed hard copies. As a result, calculating term frequency turned into labor- and time-consuming work. Although software such as Acrobat Reader has a text recognition function, scanning one textbook took about an hour and a half. Moreover, Kazakh texts were poorly recognized, as were Russian and English pages scanned in only somewhat sufficient quality. The researchers tried to count the words manually by looking through the books, but this often led to mistakes caused by lapses of attention. Thus, the need for a convenient, user-friendly method of carrying out the qualitative analysis of any printed text laid the foundation for the following study.</p>
  </sec>
  <sec>
   <title>Research Methods and Variables</title>
   <p>Quantitative and qualitative empirical research methods were applied in the study to design a new and convenient way of mining printed texts, which was then applied to the 7th, 8th, 9th, 10th, and 11th grade textbooks of the natural science course for Kazakhstani secondary schools. Ten free tools for text frequency analysis, both online and offline, were examined and compared with regard to their capabilities for calculating the frequency of words and word combinations.</p>
  </sec>
 </body>
 <back>
  <ref-list>
   <ref id="B1">
    <label>1.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">www.statsoft.com, n.d. Text Mining (Big Data, Unstructured Data) [WWW Document]. Support Vector Machines (SVM). URL: http://www.statsoft.com/Textbook/Text-Mining#overview (accessed 5.17.18).</mixed-citation>
     <mixed-citation xml:lang="en">www.statsoft.com, n.d. Text Mining (Big Data, Unstructured Data) [WWW Document]. Support Vector Machines (SVM). URL: http://www.statsoft.com/Textbook/Text-Mining#overview (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B2">
    <label>2.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Kobayashi V.B., Berkers H.A., Mol S.T. 2017. Text Mining in Organizational Research [WWW Document]. Organizational Research Methods. URL: http://journals.sagepub.com/doi/full/10.1177/1094428117722619 (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">Kobayashi V.B., Berkers H.A., Mol S.T. 2017. Text Mining in Organizational Research [WWW Document]. Organizational Research Methods. URL: http://journals.sagepub.com/doi/full/10.1177/1094428117722619 (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B3">
    <label>3.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Официальный сайт Парламента Республики Казахстан [WWW Document], n.d. URL: http://www.parlam.kz/ru/presidend-speech/5 (accessed 5.16.18).</mixed-citation>
      <mixed-citation xml:lang="en">Oficial'nyy sayt Parlamenta Respubliki Kazahstan [WWW Document], n.d. URL: http://www.parlam.kz/ru/presidend-speech/5 (accessed 5.16.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B4">
    <label>4.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Семантический анализ текста онлайн, seo-анализ текста / Адвего [WWW Document], n.d. Адвего. URL: https://advego.com/text/seo/ (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">Semanticheskiy analiz teksta onlayn, seo-analiz teksta / Advego [WWW Document], n.d. Advego. URL: https://advego.com/text/seo/ (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B5">
    <label>5.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Семантический анализ текста онлайн [WWW Document], n.d. Семантический анализ текста онлайн / istio.com - белое SEO. URL: https://istio.com/rus/text/analyz/ (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">Semanticheskiy analiz teksta onlayn [WWW Document], n.d. Semanticheskiy analiz teksta onlayn / istio.com - beloe SEO. URL: https://istio.com/rus/text/analyz/ (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B6">
    <label>6.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">wordTabulator [WWW Document], n.d. SourceForge. URL: http://wordtabulator.sourceforge.net/ (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">wordTabulator [WWW Document], n.d. SourceForge. URL: http://wordtabulator.sourceforge.net/ (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B7">
    <label>7.</label>
    <citation-alternatives>
     <mixed-citation xml:lang="ru">Simagin, A., n.d. Семантический анализ текста [WWW Document]. «Majento» - Продвижение Web-проектов. URL: http://www.majento.ru/index.php?page=seo-analize/text-semantic/index (accessed 5.17.18).</mixed-citation>
     <mixed-citation xml:lang="en">Simagin, A., n.d. Semanticheskiy analiz teksta [WWW Document]. «Majento» - Prodvizhenie Web-proektov. URL: http://www.majento.ru/index.php?page=seo-analize/text-semantic/index (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B8">
    <label>8.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">SEO [WWW Document], n.d. 1Y.ru. URL: http://1y.ru/text.php (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">SEO [WWW Document], n.d. 1Y.ru. URL: http://1y.ru/text.php (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B9">
    <label>9.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Анализ текста по закону Ципфа [WWW Document], n.d. PR-CY. URL: http://pr-cy.ru/zypfa/text (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">Analiz teksta po zakonu Cipfa [WWW Document], n.d. PR-CY. URL: http://pr-cy.ru/zypfa/text (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B10">
    <label>10.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Биржа копирайтинга, проверка текста на уникальность [WWW Document], n.d. Text.ru. URL: https://text.ru/ (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">Birzha kopiraytinga, proverka teksta na unikal'nost' [WWW Document], n.d. Text.ru. URL: https://text.ru/ (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B11">
    <label>11.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Семантический анализ текста онлайн [WWW Document], n.d. URL: https://itop.media/tools.php?i=semantics (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">Semanticheskiy analiz teksta onlayn [WWW Document], n.d. URL: https://itop.media/tools.php?i=semantics (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B12">
    <label>12.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">TextAnalyzer: Универсальный анализатор текстов [WWW Document], n.d. Text Analyzer - Универсальный анализатор текста. URL: https://www.textanalyzer.ru/ (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">TextAnalyzer: Universal'nyy analizator tekstov [WWW Document], n.d. Text Analyzer - Universal'nyy analizator teksta. URL: https://www.textanalyzer.ru/ (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B13">
    <label>13.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Wladm, 2018. Lit Frequency Meter [WWW Document]. Software Informer. URL: http://litfrequencymeter.software.informer.com/5.2/ (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">Wladm, 2018. Lit Frequency Meter [WWW Document]. Software Informer. URL: http://litfrequencymeter.software.informer.com/5.2/ (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B14">
    <label>14.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Speech to Text Online Notepad. Free [WWW Document], n.d. Speechnotes. URL: https://speechnotes.co/ (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">Speech to Text Online Notepad. Free [WWW Document], n.d. Speechnotes. URL: https://speechnotes.co/ (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B15">
    <label>15.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">ListNote Speech-to-Text Notes - Apps on Google Play [WWW Document], n.d. Google. URL: https://play.google.com/store/apps/details?id=com.khymaera.android.listnotefree&amp;hl=en (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">ListNote Speech-to-Text Notes - Apps on Google Play [WWW Document], n.d. Google. URL: https://play.google.com/store/apps/details?id=com.khymaera.android.listnotefree&amp;hl=en (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B16">
    <label>16.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Speech to Text Translator TTS - Apps on Google Play [WWW Document], n.d. Google. URL: https://play.google.com/store/apps/details?id=com.fsm.speech2text&amp;hl=en_US (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">Speech to Text Translator TTS - Apps on Google Play [WWW Document], n.d. Google. URL: https://play.google.com/store/apps/details?id=com.fsm.speech2text&amp;hl=en_US (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
   <ref id="B17">
    <label>17.</label>
    <citation-alternatives>
      <mixed-citation xml:lang="ru">Voice Text - Apps on Google Play [WWW Document], n.d. Google. URL: https://play.google.com/store/apps/details?id=com.matthew.rice.voice.text&amp;hl=en (accessed 5.17.18).</mixed-citation>
      <mixed-citation xml:lang="en">Voice Text - Apps on Google Play [WWW Document], n.d. Google. URL: https://play.google.com/store/apps/details?id=com.matthew.rice.voice.text&amp;hl=en (accessed 5.17.18).</mixed-citation>
    </citation-alternatives>
   </ref>
  </ref-list>
 </back>
</article>
