INFORMATION AND COMMUNICATION

Purpose. During the development of railway ontologies, it is necessary to take into account both the data of information systems and regulatory support to check their consistency. To do this, data integration is performed. The purpose of the work is to formalize the methods for integrating heterogeneous sources of information and ontology formation.
Methodology. Constructive-synthesizing modelling of ontology formation and its resources was developed.
Findings. Ontology formation formalization has been performed, which allows expanding the possibilities of automating the integration and coordination of data using ontologies. In the future, it is planned to expand the structural system for the formation of ontologies based on textual sources of railway regulatory documentation and information systems.
Originality. The authors laid the foundations of using constructive-synthesizing modelling in the railway transport ontological domain to form the structure and data of the railway train speed restriction warning tables (database and csv format), their transformation into a common tabular format, vocabulary, rules and ontology individuals, as well as ontology population. Ontology learning methods have been developed to integrate data from heterogeneous sources.
Practical value. The developed methods make it possible to integrate heterogeneous data sources (the structure of the table of the railway train management rules, the form and application for issuing a warning), which are railway domain-specific. It allows forming an ontology from its data sources (database and csv formats) to schema and individuals. Integration and consistency of information system data and regulatory documentation is one of the aspects of increasing the level of train traffic safety.


Introduction
In Europe, transport ontologies [15] are developed to accommodate the evolution of information support and to integrate heterogeneous databases without changing them. Development relies on complementary software tools. The transformation of railway transport data into an ontology is applied only partially. In [3], the RailTopoModel UML model was transformed into an ontology schema. In the development of the Rail Core Ontology [15], OpenRefine was used to transform railway train timetables into ontology individuals.
Previously, we developed a railway track ontology formation procedure. This paper formalizes the railway ontology formation procedure using the means and methods of constructive-synthesizing modelling (CSM) [16].

Problem statement and purpose
Automating the formation of ontologies eases the laborious ontology development process and is the subject of ontology learning (OL).
OL is performed using tabular and textual sources and a variety of models. OL tools are based on logic rules, machine learning, statistical methods, etc.
When developing railway ontological support, it is necessary to consider both information system data and regulatory support to check their consistency. According to the review [11], insufficient attention is paid to data integration in OL.
The purpose of the work is to formalize the methods for integrating heterogeneous information sources during ontology formation. As an example, the information support of the railway track is considered in terms of train speed restrictions imposed because of the technical condition of the track.

Analysis of recent research and publications
Automation of ontology development implies the automation of schema development and data transformation into ontology individuals.
For the ontology schema, the following methods are used: transformation of an XML schema, for example, using XSD2OWL [6] (as well as UML [14], etc.), into an ontology schema; use of a controlled language for ontology development, for example, OntoDL in Onto2OWL [8]; automated semantic annotation of documents, for example, using text2onto [4].
For our work, tabular data structure mapping is more relevant. In the case of the ontology schema generation using the SQL-DDL database schema, mapping rules are used. The tables are mapped to the corresponding ontology classes [2]. Other database-based ontology generation tools are presented in the review [11].
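A minimal sketch of such a table-to-class mapping is given below. It is not the rule set of [2]: the table and column names, the base IRI and the choice to map every column to a datatype property are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical DDL: the table and column names are illustrative only.
DDL = """
CREATE TABLE warning (
    id INTEGER PRIMARY KEY,
    location TEXT,
    speed_limit_kmh INTEGER
);
"""

def schema_to_triples(conn, base="http://example.org/rail#"):
    """Map each table to an OWL class and each column to a datatype property."""
    triples = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        cls = f"<{base}{table.capitalize()}>"
        triples.append((cls, "rdf:type", "owl:Class"))
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, col, _type, _notnull, _default, _pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            prop = f"<{base}{table}_{col}>"
            triples.append((prop, "rdf:type", "owl:DatatypeProperty"))
            triples.append((prop, "rdfs:domain", cls))
    return triples

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
schema_triples = schema_to_triples(conn)
for triple in schema_triples:
    print(" ".join(triple), ".")
```

Real mapping tools also handle foreign keys (typically as object properties) and data types; both are omitted here to keep the idea visible.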
The formation of the ontology schema is also formalized on the basis of an ontology pattern language (OPL). As part of this approach, ontology design patterns have been developed, and the possibilities for combining them are demonstrated using OPL tools. The ontology pattern language is represented using a grammar [13] and OPL languages [12] based on derivation rules.
Automation of data transformation into ontology instances is performed by the following means: data wrangling, such as in the Karma [10] workbench and OpenRefine; semantic role labelling, for example in Inception [9]; a Virtual Knowledge Graph system, for example Ontop [21].
Tools such as the "RDB to RDF Mapping Language" (R2RML) are used to map table data onto an ontology schema, followed by ontology population to integrate the data and check their consistency.
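The idea of mapping rows onto a schema and then checking the populated data can be sketched as follows. The mapping dictionary only loosely imitates a triples map and is not R2RML syntax; the IRIs, column names and the set of allowed speeds are hypothetical and are not taken from the regulatory documents.

```python
# Hypothetical mapping: a subject IRI template plus column-to-property pairs.
MAPPING = {
    "subject": "http://example.org/rail#warning/{id}",
    "class": "http://example.org/rail#Warning",
    "properties": {
        "location": "http://example.org/rail#location",
        "speed_limit_kmh": "http://example.org/rail#speedLimit",
    },
}

# Hypothetical set of permitted speed restriction values, for the check only.
ALLOWED_SPEEDS = {15, 25, 40, 60}

def populate(rows, mapping):
    """Turn table rows into (subject, predicate, object) triples."""
    triples = []
    for row in rows:
        subject = mapping["subject"].format(**row)
        triples.append((subject, "rdf:type", mapping["class"]))
        for column, prop in mapping["properties"].items():
            if row.get(column) is not None:
                triples.append((subject, prop, row[column]))
    return triples

def inconsistent(triples, prop, allowed):
    """Return subjects whose value for prop is outside the allowed set."""
    return [s for s, p, o in triples if p == prop and o not in allowed]

rows = [
    {"id": 1, "location": "km 12.0-12.4", "speed_limit_kmh": 40},
    {"id": 2, "location": "km 30.1-30.2", "speed_limit_kmh": 55},
]
triples = populate(rows, MAPPING)
violations = inconsistent(
    triples, MAPPING["properties"]["speed_limit_kmh"], ALLOWED_SPEEDS
)
print(violations)
```

In an ontology-based setting the check would normally be expressed as axioms and evaluated by a reasoner; the plain set test above merely stands in for that step.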
Another way to automate ontology formation with several software tools is to use platforms that integrate them. Integration can be understood as the connection of services in ontology-based applications, where RabbitMQ and Redis messages are used [15].
For our work, the integration of ontology development applications in the sense of mapping heterogeneous files is more relevant, for example, as in OntoPop [1], where tags for annotation and ontology classes are mapped. First, documents are annotated, rules are developed in the LangText language [5] for mapping tags and ontology, and then the ontology is populated. Ontology individuals are retrieved from the text.
In other platforms, such as [7], Java Annotation Patterns Engine (JAPE) rules are used to map annotations to ontology instances as part of text2onto [4].

Methodology
CSM is based on formal grammar and is used to formalize the formation of structures and constructive processes of various natures. The basis is the generic constructor, which is presented in [16].
Here, CSM is used to formalize the procedure for developing a railway track ontology, using the example of a speed restriction warning issued because of track defects.
The paper presents only the specialization, interpretation and concretization of the generative constructor for the formation of the speed restriction warning database table. The following constructors are available in [18]: formation of railway train management rules (TMR) [20]; transformation of the ontology vocabulary and the TMR DU-61 warning table structure into ontology rules; transformation of schema and instances into ontology.

Findings
Information support of the railway track for a defect-related speed restriction warning includes (Figure 1): an application for issuing warnings indicating the location and speed restriction; the data structure description of the TMR DU-61 warning; and the warning form.
The development of an ontology of data sources for the DU-61 warning involves the following steps:
filling in the database DU-61 form table according to the request of the track master using SQLite;
transformation of the database DU-61 form table into a generic representation using OpenRefine;
filling in csv tables according to the model of tabular knowledge representation [16] and the TMR DU-61 table;
converting the TMR DU-61 table into a csv metadata table using a text editor;
converting csv tables into a generic representation using OpenRefine;
transformation of the generic-format TMR DU-61 metadata tables and the tabular knowledge representation model into the ontology vocabulary using the OpenRefine RDF extension;
transformation of the generic-format DU-61 form table and the ontology vocabulary into ontology instances (ontology ABox) using the OpenRefine RDF extension;
transformation of the ontology vocabulary into ontology rules (ontology TBox) using Protégé;
integration of the schema and ontology instances using Protégé.

Generative constructor for forming speed restriction warning database table
Constructors have been developed for each stage of development, with Protégé, OpenRefine, Tabula and SQLite as actors, together with input and output data (Figure 1). The procedure for developing constructors is demonstrated. For the generative constructor, the inputs are the application, the TMR, and the tabular knowledge representation model. The inputs of the transformation constructor and the outputs of the generative and transformation constructors are the constructs obtained through inference.
The purpose of the constructor is to generate a speed restriction warning database table.
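As a concrete picture of the artefact this constructor produces, the sketch below fills a small SQLite table and exports it as CSV with a header row. The table name `du61_form`, its columns and the sample values are hypothetical and do not reproduce the actual DU-61 form; in the paper, the conversion to a generic representation is done with OpenRefine, for which the CSV export here is only a stand-in.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table layout; the real DU-61 form columns are not reproduced.
conn.execute(
    "CREATE TABLE du61_form ("
    "id INTEGER PRIMARY KEY, section TEXT, "
    "km_from REAL, km_to REAL, speed_limit_kmh INTEGER)"
)
# One illustrative row, as if taken from a track master's application.
conn.execute(
    "INSERT INTO du61_form (section, km_from, km_to, speed_limit_kmh) "
    "VALUES (?, ?, ?, ?)",
    ("A-B", 12.0, 12.4, 40),
)

def table_to_csv(conn, table):
    """Export a whole table as CSV text with a header row."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    header = [column[0] for column in cursor.description]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(cursor)
    return buffer.getvalue()

csv_text = table_to_csv(conn, "du61_form")
print(csv_text)
```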
Consider the specialization of the generalized constructor (GC) based on the constructive-synthesizing approach. The information support for the construction of the SC may include the semantics of concepts and operations, derivation rules, restrictions, and initial and termination conditions. It includes the following concepts: table, table name, table key column (key), tuple, number of columns (b), number of rows (f), attribute value (LIT) and the sequence relation, as well as the operations [5] of full (||) and partial (|) derivation and the constrIndex and constrIndexPart strings. It also includes the following constraint: partial derivation is performed considering the derivation relation attribute (t); if t = 0, the rule is not available.
The derivation rules are defined when the constructor is instantiated.
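The role of the availability attribute can be illustrated with a generic grammar sketch. This is not the CSM formalism of [16]: the `Rule` class, the rule set and the reading of partial derivation as a single substitution step and full derivation as repetition until no rule applies are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    lhs: str     # symbol to be replaced
    rhs: list    # replacement symbols
    t: int = 1   # availability attribute; t == 0 makes the rule unavailable

def partial_derivation(form, rules):
    """One step: apply the first available rule to the leftmost matching symbol."""
    for i, symbol in enumerate(form):
        for rule in rules:
            if rule.t != 0 and rule.lhs == symbol:
                return form[:i] + rule.rhs + form[i + 1:]
    return None  # no available rule applies

def full_derivation(form, rules, max_steps=100):
    """Repeat partial derivation until no available rule applies."""
    for _ in range(max_steps):
        next_form = partial_derivation(form, rules)
        if next_form is None:
            return form
        form = next_form
    raise RuntimeError("derivation did not terminate")

# Hypothetical rules: a table with a name and two tuples of a key and two values.
rules = [
    Rule("Table", ["name", "Tuple", "Tuple"]),
    Rule("Tuple", ["key"], t=0),           # skipped because t == 0
    Rule("Tuple", ["key", "LIT", "LIT"]),
]

result = full_derivation(["Table"], rules)
print(result)
```

Setting `t=0` on a rule removes it from consideration without deleting it, which mirrors the constraint stated above.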

Originality and practical value
The basis for automating the formation of the railway domain ontology by the constructive-synthesizing method has been laid.
The authors laid the foundations of using constructive-synthesizing modelling in the railway transport ontological domain to form the structure and data of the railway train speed restriction warning tables (database and csv format), their transformation into a common tabular format, vocabulary, rules and ontology individuals, as well as ontology population.
Railway-oriented ontology learning methods have been developed to integrate data from heterogeneous sources (the structure of the TMR DU-61 table, the form and the application for issuing a warning). They allow forming an ontology from its data sources (database and csv formats) to schema and instances.
Integration and consistency checking of data from information systems and regulatory documentation is one of the aspects of increasing the train traffic safety level.

Conclusions
Ontology formation has been formalized, which expands the possibilities of automating data integration and consistency checking using ontologies.
In the future, it is planned to expand the structural system for the formation of ontologies based on textual sources of regulatory documentation and information systems of the railway.