Getting started with Talend Component Kit

Talend Component Kit is a Java framework designed to simplify the development of components at two levels:

  • The Runtime, that injects the specific component code into a job or pipeline. The framework helps unifying as much as possible the code required to run in Data Integration (DI) and BEAM environments.

  • The Graphical interface. The framework helps unifying the code required to render the component in a browser or in the Eclipse-based Talend Studio (SWT).

Most part of the development happens as a Maven or Gradle project and requires a dedicated tool such as IntelliJ.

The Component Kit is made of:

  • A Starter, that is a graphical interface allowing you to define the skeleton of your development project.

  • APIs to implement components UI and runtime.

  • Development tools: Maven and Gradle wrappers, validation rules, packaging, Web preview, etc.

  • A testing kit based on JUnit 4 and 5.

By using this tooling in a development environment, you can start creating components as described below.

Talend Component Kit methodology

Developing new components using the Component Kit framework includes:

  1. Creating a project using the starter or the Talend IntelliJ plugin. This step allows to build the skeleton of the project. It consists in:

    1. Defining the general configuration model for each component in your project.

    2. Generating and downloading the project archive from the starter.

    3. Compiling the project.

  2. Importing the compiled project in your IDE. This step is not required if you have generated the project using the IntelliJ plugin.

  3. Implementing the components, including:

    1. Registering the components by specifying their metadata: family, categories, version, icon, type and name.

    2. Defining the layout and configurable part of the components.

    3. Defining the execution logic of the components, also called runtime.

  4. Testing the components.

  5. Deploying the components to Talend Studio or Cloud applications.

Optionally, you can use services. Services are predefined or user-defined configurations that can be reused in several components.

Methodology

Component types

There are four types of components, each type coming with its specificities, especially on the runtime side.

  • Input components: Retrieve the data to process from a defined source. An input component is made of:

    • The execution logic of the component, represented by a Mapper or an Emitter class.

    • The source logic of the component, represented by a Source class.

    • The layout of the component and the configuration that the end-user will need to provide when using the component, defined by a Configuration class. All input components must have a dataset specified in their configuration, and every dataset must use a datastore.

  • Processors: Process and transform the data. A processor is made of:

    • The execution logic of the component, describing how to process each records or batches of records it receives. It also describes how to pass records to its output connections. This logic is defined in a Processor class.

    • The layout of the component and the configuration that the end-user will need to provide when using the component, defined by a Configuration class.

  • Output components: Send the processed data to a defined destination. An output component is made of:

    • The execution logic of the component, describing how to process each records or batches of records it receives. This logic is defined in an Output class. Unlike processors, output components are the last components of the execution and return no data.

    • The layout of the component and the configuration that the end-user will need to provide when using the component, defined by a Configuration class. All input components must have a dataset specified in their configuration, and every dataset must use a datastore.

  • Standalone components: Make a call to the service or run a query on the database. A standalone component is made of:

    • The execution logic of the component, represented by a DriverRunner class.

    • The layout of the component and the configuration that the end-user will need to provide when using the component, defined by a Configuration class. All input components must have a datastore or dataset specified in their configuration, and every dataset must use a datastore.

The following example shows the different classes of an input components in a multi-component development project:

Input
Scroll to top